AILab presents its research at peer-reviewed scholarly conferences and in journals. Its most recent contributions, published over the past eight years, are listed below.
Journal publications:
- 2022:
- MÁLEK, J. et al., Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification, IEEE-ACM Transactions On Audio Speech And Language Processing, 2022, vol. 30, pp. 2295 – 2309.
- 2021:
- ČERVA, P. et al., Identification of related languages from spoken data: Moving from off-line to on-line scenario, Computer Speech and Language, Elsevier, 2021, vol. 68.
- MATĚJŮ, L. et al., An Empirical Assessment of Deep Learning Approaches to Task-Oriented Dialog Management, Neurocomputing, Elsevier, 2021, vol. 439, pp. 327 – 339.
- 2020:
- MÁLEK, J., KOLDOVSKÝ, Z. and BOHÁČ, M., Block-online Multi-channel Speech Enhancement Using Deep Neural Network-supported Relative Transfer Function Estimates, IET Signal Processing, 2020, vol. 14, pp. 124 – 133.
- 2018:
- PALEČEK, K.: Experimenting With Lipreading For Large Vocabulary Continuous Speech Recognition. In Journal on Multimodal User Interfaces, Special Issue on Speech Communication Integrated with other Modalities, Volume: 12, Issue: 4, 2018, pp: 309-318.
- 2017:
- BORSKÝ, M., MIZERA, P., POLLÁK, P., NOUZA, J., Dithering techniques in automatic recognition of speech corrupted by MP3 compression: Analysis, solutions and experiments, Speech Communication, 2017, vol. 86, pp. 75–84.
Conference publications:
- 2023:
- CHALOUPKA, J., PALEČEK, K., Audio-Visual Broadcast Transcription System in the Era of Covid-19. Proceedings of 46th International Conference on Telecommunications and Signal Processing, TSP 2023, 2023-7-12, pp. 276-279.
- KYNYCH, F., ŽĎÁNSKÝ, J., ČERVA, P. and MATĚJŮ, L., Online Speaker Diarization Using Optimized SE-ResNet Architecture, 26th International Conference on Text, Speech and Dialogue, TSD 2023, 2023, pp. 176 – 187.
- MATĚJŮ, L., NOUZA, J., ČERVA, P., ŽĎÁNSKÝ, J., KYNYCH, F., Combining Multilingual Resources and Models to Develop State-of-the-Art E2E ASR for Swedish, Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech 2023, 2023, pp. 3252 – 3256.
- NOUZA, J., MATĚJŮ, L., ČERVA, P., ŽĎÁNSKÝ, J., Developing State-of-the-Art End-to-End ASR for Norwegian, 26th International Conference on Text, Speech and Dialogue, TSD 2023, 2023, pp. 200–213.
- POLÁČEK, M., ČERVA, P., ŽĎÁNSKÝ, J. and WEINGARTOVÁ, L., Online Punctuation Restoration using ELECTRA Model for streaming ASR Systems, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2023, 2023, pp. 446 – 450.
- 2022:
- MÁLEK, J., ČMEJLA, J. and KOLDOVSKÝ, Z., Blind Extraction of Target Speech Source: Three ways of Guidance Exploiting Supervised Speaker Embeddings, IWAENC 2022, 2022.
- MATĚJŮ, L. et al., Overlapped Speech Detection in Broadcast Streams Using X-vectors, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 2022, pp. 4606 – 4610.
- NOUZA, J., ČERVA, P. and ŽĎÁNSKÝ, J., Lexicon-based vs. Lexicon-free ASR for Norwegian Parliament Speech Transcription, International Conference on Text, Speech, and Dialogue 2022, 2022, pp. 401 – 409.
- 2021:
- ČERVA, P. et al., Identification of Scandinavian Languages from Speech Using Bottleneck Features and X-vectors, In proceedings: Text, Speech, and Dialogue, 2021, pp. 371 – 381.
- CHALOUPKA, J., PALEČEK, K. and ČERVA, P., Audio-visual Broadcast Transcription System Using Artificial Neural Networks, 2021 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics, ECMSM 2021 IEEE, 2021.
- MÁLEK, J. et al., Blind extraction of moving audio source in a challenging environment supported by speaker identification via X-vectors, ICASSP 2021, 2021, pp. 226 – 230.
- MATĚJŮ, L. et al., Using X-vectors for Speech Activity Detection in Broadcast Streams, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 2021, pp. 4161 – 4165.
- PALEČEK, K. and CHALOUPKA, J., Logo Detection and Identification in System for Audio-Visual Broadcast Transcription, 44th International Conference on Telecommunications and Signal Processing IEEE, 2021, pp. 357 – 360.
- 2020:
- ČERVA, P., VOLNÁ, V. and WEINGARTOVÁ, L., Dealing with Newly Emerging OOVs in Broadcast Programs by Daily Updates of the Lexicon and Language Model, 22nd International Conference on Speech and Computer, SPECOM 2020, 2020, pp. 97 – 107.
- CHALOUPKA, J., Audio-Visual TV Broadcast Signal Segmentation, Advances in Intelligent Systems and Computing, 6th International Conference on Man-Machine Interactions, Germany: Springer, 2020, pp. 221 – 228.
- CHALOUPKA, J., PALEČEK, K., ČERVA, P. and ŽĎÁNSKÝ, J., Optical Character Recognition for Audio-Visual Broadcast Transcription System, 11th IEEE International Conference on Cognitive Infocommunications, CogInfoCom 2020, Finland, IEEE, 2020, pp. 229 – 232.
- MÁLEK, J. and ŽĎÁNSKÝ, J., Voice-activity and overlapped speech detection using x-vectors, 23rd International Conference on Text, Speech, and Dialogue, TSD 2020, 2020, pp. 366 – 376.
- NOUZA, J., ČERVA, P. and ŽĎÁNSKÝ, J., Very Fast Keyword Spotting System with Real Time Factor below 0.01, 23rd International Conference on Text, Speech, and Dialogue, TSD 2020, 2020, pp. 426 – 436.
- PALEČEK, K. and CHALOUPKA, J., Pet immunity for PIR sensors using deep learning, 43rd International Conference on Telecommunications and Signal Processing, TSP 2020, Italy, IEEE, 2020, pp. 342 – 345.
- 2019:
- CHALOUPKA, J., A prototype of Audio-Visual Broadcast Transcription System, 42nd International Conference On Telecommunications And Signal Processing, TSP 2019, USA, 2019, pp. 543 – 547.
- MÁLEK, J. and ŽĎÁNSKÝ, J., On Practical Aspects of Multi-condition Training Based on Augmentation for Reverberation-/Noise-Robust Speech Recognition, TSD 2019, 2019, pp. 251 – 263.
- MATĚJŮ, L., ČERVA, P. and ŽĎÁNSKÝ, J., An Approach to Online Speaker Change Point Detection Using DNNs and WFSTs, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2019, 2019, pp. 649 – 653.
- PALEČEK, K., Deep learning for logo detection, 42nd International Conference On Telecommunications And Signal Processing 2019, 2019, pp. 609 – 612.
- 2018:
- MÁLEK, J., ŽĎÁNSKÝ, J., ČERVA, P.: Robust Recognition of Speech with Background Music in Acoustically Under-Resourced Scenarios, ICASSP 2018, Calgary, Canada, 2018, pp. 5624-5628.
- MÁLEK, J., ŽĎÁNSKÝ, J., ČERVA, P.: Robust Recognition of Conversational Telephone Speech via Multi-Condition Training and Data Augmentation, TSD 2018, Brno, Czech Republic, 2018, pp. 324-333.
- MATĚJŮ, L., ČERVA, P., ŽĎÁNSKÝ, J. and ŠAFAŘÍK, R.: Using Deep Neural Networks for Identification of Slavic Languages from Acoustic Signal, Interspeech 2018, Hyderabad, India, 2018, pp. 1803-1807.
- ŠAFAŘÍK, R., MATĚJŮ, L.: Automatic Development of ASR System for an Under-Resourced Language, In proc. of 41st International Conference on Telecommunications and Signal Processing, TSP 2018, Athens, Greece, 2018, pp. 100-103.
- ŠAFAŘÍK, R., MATĚJŮ, L., WEINGARTOVÁ, L.: The Influence of Errors in Phonetic Annotations on Performance of Speech Recognition System, In proc. of 21st International Conference on Text, Speech and Dialogue, TSD 2018, Brno, Czech Republic, 2018, pp. 419-427.