PhD internship (Speech AI)

Internship · Paris

About the role

We're looking for a PhD intern, also called a Visiting PhD, to work within our research team and help us build the next generation of speech AI technology.

Internship description:

As a PhD intern, you will be part of a team of experienced researchers and will actively work on cutting-edge models and push the boundaries of what's possible with voice synthesis and recognition. Your main missions will be as follows:

  1. Definition of the internship subject: in collaboration with your supervisor, you will participate in defining a precise research subject according to your profile, your current research interests and ours.

  2. Literature review: you will carry out an in-depth study of the state of the art on the chosen subject (reading scientific articles, understanding existing techniques and potentially reproducing certain experimental results).

  3. Formulation of hypotheses: you will critically analyze the state of the art, generate new ideas and, in conjunction with your supervisor, formulate clear and testable research hypotheses, specifying the expected results.

  4. Development of models and methods: you will then explore new models or new learning techniques. This covers designing the model architecture and its optimization algorithm, training the model on suitable datasets (to be built if necessary) and, finally, evaluating it by computing performance metrics on existing benchmarks and comparing it against baselines and alternative models. Software development and experiments will be conducted according to the best practices in the field.

  5. Publication of results: where applicable, you will contribute to the publication of the results of your research by writing a research article or paper (and potentially submitting it to scientific conferences), and to the publication of code on platforms such as GitHub or Hugging Face.

Qualifications

  • Ongoing PhD in AI/ML in the field of Audio (ideally Speech), with the possibility of taking a 4 to 6 month break to complete an internship in a professional environment.

  • Solid knowledge of mathematics and algorithms.

  • Advanced understanding of the fundamental concepts of Machine Learning and Deep Learning (in particular DL architectures: Transformers, CNNs, RNNs, ...), ideally applied to audio signal processing, speech processing or natural language processing.

  • Strong programming skills in Python and experience with ML frameworks (PyTorch, JAX or TensorFlow).

  • Intellectual curiosity, scientific rigor and team spirit.

  • Experience with large-scale distributed training.

Nice to Have

  • Published research in top-tier ML conferences (NeurIPS, ICML, ICLR, etc.)

  • Experience with speech synthesis models (Tacotron, FastSpeech, VITS, etc.)

  • Experience with model optimization and quantization

Apply for the role

Do you want to join our team as our new PhD intern? Then we'd love to hear from you!