Commit 8f84167e authored by Pitti Alexandre

Update README.md

parent c0eada05
# inferno
transfer in progress...
# Repository for the Paper "Brain-inspired model for early vocal learning and correspondence matching using free-energy optimization", PLoS Computational Biology.
*Abstract*
We propose a neural architecture called INFERNO, which stands for Iterative Free-Energy Optimization of Recurrent Neural Networks. Free-energy (noise) minimization is used for exploring, selecting, and learning sound primitives. The whole system is implemented with recurrent spiking neural networks that learn and retrieve the spike trains constituting the audio memory sequences, encoded at the millisecond scale.
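
To make the explore/select idea concrete, here is a minimal sketch of a noise-driven search that keeps a perturbation only when it lowers an error measure. It is a generic random-search illustration, not the INFERNO algorithm itself: the function names, the target vector, and the squared-error measure are all hypothetical and chosen only for this example.

```python
import random

def minimize_by_noise(error, start, steps=200, noise=0.5, seed=0):
    """Random-search sketch: perturb, keep the candidate only if the error drops."""
    rng = random.Random(seed)
    best = list(start)
    best_err = error(best)
    for _ in range(steps):
        # Explore: add Gaussian noise to the current best candidate.
        candidate = [x + rng.gauss(0.0, noise) for x in best]
        cand_err = error(candidate)
        # Select: retain the candidate only when it lowers the error.
        if cand_err < best_err:
            best, best_err = candidate, cand_err
    return best, best_err

# Hypothetical target standing in for a sound primitive to be matched.
target = [1.0, -2.0, 0.5]

def squared_error(v):
    return sum((a - b) ** 2 for a, b in zip(v, target))

start = [0.0, 0.0, 0.0]
solution, err = minimize_by_noise(squared_error, start)
print(err <= squared_error(start))
```

Because a candidate is accepted only when it improves on the current best, the error returned can never exceed the error of the starting point.
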
# Experiment 1 - compact representation in small dataset
**audio database *14,000 MFCC* (3 minutes long), network size *14,000 MFCC* (compression 1:1); here we study the reconstruction capacity of the INFERNO architecture**
The audio dataset consists of five French sentences, each repeated three times by a native speaker (a young woman). The audio .wav file is translated into **14000 MFCC** vectors (dimension 12) sampled at 25 ms each.
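
As a rough illustration of how a recording is cut into analysis frames before MFCC extraction, the sketch below counts the frames that fit in a signal. Only the 25 ms frame length comes from the text; the hop size between frames is not stated there, so the hop values used here are assumptions, and `count_frames` is a hypothetical helper.

```python
def count_frames(duration_ms, frame_ms=25, hop_ms=25):
    """Number of full analysis frames that fit in a signal of `duration_ms`."""
    if duration_ms < frame_ms:
        return 0
    return 1 + (duration_ms - frame_ms) // hop_ms

cut_ms = 20_000  # a 20-second cut, in milliseconds

# Non-overlapping 25 ms frames (assumed hop equal to the frame length).
print(count_frames(cut_ms))
# Overlapping frames with a hypothetical 10 ms hop.
print(count_frames(cut_ms, hop_ms=10))
```

Each frame would then yield one 12-dimensional MFCC vector, so the number of vectors depends directly on the hop that is chosen.
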
The number of Striatal and Gp units is chosen so that the representation of the MFCC vectors is orthogonal, which means that the size of the BG layers equals the number of MFCC vectors to retrieve in the sequence, i.e. **14000 units**.
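
The orthogonal, one-unit-per-vector setting can be pictured with a one-hot code. This is a minimal sketch under the assumption that "orthogonal" means exactly one active unit per stored vector; the spiking network may realize this differently, and the toy sequence, `one_hot`, and `retrieve` are hypothetical names used only for illustration.

```python
def one_hot(index, size):
    """Return a code with a single active unit at `index`."""
    code = [0] * size
    code[index] = 1
    return code

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy sequence of 4 MFCC-like vectors (3 dimensions instead of 12, for brevity).
sequence = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
n_units = len(sequence)  # one unit per vector, as in the 1:1 setting

codes = [one_hot(i, n_units) for i in range(n_units)]

def retrieve(code):
    """Read back the stored vector addressed by the single active unit."""
    return sequence[code.index(1)]

print(dot(codes[0], codes[1]))  # distinct codes do not overlap
print(retrieve(codes[3]))
```

With as many units as vectors, every code is orthogonal to every other, so each vector in the sequence can be addressed without interference.
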
### Audio Files (speaker #1, 20 seconds cut)
sentence:
"un homme de bien agit et raisonne en homme de bien, un méchant agit et raisonne en méchant"
Pierre Corneille ; Discours du poème dramatique (1660)
[Original wav file/Filtered]
[Original .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/cut_1_Corneille.wav)
[Filtered .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/cut_1_Corneille_filtered.wav)
[Reconstructed sound, Speaker #1, Sentence #1]
[reconstructed .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_compact_encoding_short_sentence_pass1.wav)
[Reconstructed sound, after free-energy minimization period #0]
[reconstructed .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_orthogonal_encoding_short_sentence_pass1.wav)
[Reconstructed sound, after free-energy minimization period #1]
[reconstructed .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_orthogonal_encoding_short_sentence_pass2.wav)
[Reconstructed sound, after free-energy minimization period #2]
[reconstructed .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_orthogonal_encoding_short_sentence_pass3.wav)
[Reconstructed sound, after free-energy minimization period #3]
[reconstructed .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_orthogonal_encoding_short_sentence_pass4.wav)
# Experiment 2 - generalization in large dataset
**audio database *140,000 MFCC* (29 minutes long), network size *14,000 MFCC* (compression 1:10); here we study the generalization capacity of the INFERNO architecture**
Experiment 2 uses a larger audio dataset, 27 minutes long, recorded from six native French speakers (same sentence as in Experiment 1), three women and three men. The audio .wav file is translated into **140000 MFCC** vectors (dimension 12) sampled at 25 ms each.
The number of Striatal and Gp units is kept the **same** as in the first experiment (i.e. **14000 units**), which means that the size of the BG layers is ten times smaller than the number of MFCC vectors to retrieve in the sequence.
This second experiment tests the *generalization* capabilities of the INFERNO architecture and its robustness to high variability.
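
With ten times fewer units than vectors, each unit has to stand for several vectors. The sketch below illustrates one simple way this could work, nearest-prototype assignment; the text does not describe how INFERNO shares units among vectors, so this scheme, the toy data, and the prototype values are assumptions made only for illustration.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_unit(vector, prototypes):
    """Index of the prototype closest to `vector`."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(vector, prototypes[i]))

# Toy data: 20 two-dimensional vectors shared among 2 units (ratio 1:10).
vectors = ([[0.0 + 0.01 * k, 0.0] for k in range(10)]
           + [[5.0 + 0.01 * k, 5.0] for k in range(10)])
prototypes = [[0.0, 0.0], [5.0, 5.0]]  # hypothetical unit prototypes

assignments = [nearest_unit(v, prototypes) for v in vectors]
vectors_per_unit = len(vectors) // len(prototypes)

print(assignments)
print(vectors_per_unit)
```

In this toy setting each unit covers ten nearby vectors, which mirrors the 1:10 ratio between 14000 units and 140000 MFCC vectors.
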
### Audio Files (speaker #1, 20 seconds cut)
sentence:
"un homme de bien agit et raisonne en homme de bien, un méchant agit et raisonne en méchant"
Pierre Corneille ; Discours du poème dramatique (1660)
[Original wav file/Filtered, Speaker #1 only, Sentence #1]
[Original .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/cut_1_Corneille.wav)
[Filtered .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/cut_1_Corneille_filtered.wav)
[Reconstructed sound, Speaker #1-#6, Sentence #1]
[Speaker1 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/inferno1_compact_encoding_short_sentence_pass1.wav)
[Speaker2 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/xp2_speaker2_sentence1.wav)
[Speaker3 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/xp2_speaker3_sentence1.wav)
[Speaker4 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/xp2_speaker4_sentence1.wav)
[Speaker5 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/xp2_speaker5_sentence1.wav)
[Speaker6 Sentence1 .WAV](https://promethe.u-cergy.fr/alexpitt/inferno/blob/master/generated_wav/xp2_speaker6_sentence1.wav)