Training;Degradation;Transducers;Costs;Conferences;Speech enhancement;Transformers;automatic speech recognition;speech translation;streaming;serialized output training