Determination of protein composition in milk by mid-infrared spectrometry
M. Ferrand (1), G. Miranda (2), H. Larroque (4), S. Guisnel (1), O. Leray (5), F. Lahalle (1,3), M. Brochard (1), P. Martin (2)
(1) Institut de l’Elevage, (2) INRA GABI, (3) CNIEL, (4) INRA SAGA, (5) Actilait
Outline
• Context and motivations
• Materials and methods
• Results
• Conclusions and perspectives
ICAR - 28/05/2012
Context
Milk = complex product with many components
• nutritional interest
• technological properties
No cheap, easy-to-use, large-scale method to measure all milk components
PhénoFinlait: aims
Develop and control methods to analyze fine milk composition easily
Use these analytical developments to:
- study the impact of genetics and feeding management on milk composition
- build new tools to manage milk composition (Dairy Herd Improvement (DHI) and genomics)
Major milk proteins
6 main milk proteins:
• Caseins (ca. 80%): αs1-Casein, αs2-Casein, β-Casein, κ-Casein
• Whey proteins (ca. 20%): β-Lactoglobulin (β-LG), α-Lactalbumin (α-LA)
Reference method (Miranda et al.)
Need to establish a reference method to identify and quantify major milk proteins:
Liquid Chromatography + Mass Spectrometry (LC-MS)
Creation of a database of masses including genetic variants, splicing variants, post-translational modifications and main proteolysis products.
Routine method
MIR spectra are routinely obtained by milk recording laboratories for fat and protein percentage measurements
Already used to estimate fatty acid (FA) and protein composition in cow milk (Soyeurt, 2006; Rutten, 2011; Bonfatti, 2011)
[Figure: MIR spectra from 75 cow milk samples (UE INRA Mirecourt + Domaine du Pin), recorded on a MilkoScan FT6000 (Foss Electric, Hillerod, Denmark) at LILANO (milk recording laboratory). x-axis: wavenumber (cm-1), ticks from 926 to 2954; y-axis: absorbance, ticks from 0.4 to 1.2.]
Development of equations
• Traditionally by PLS regression
• Pretreatments can be useful: e.g. derivation to eliminate uncontrolled spectral variations (Soyeurt, 2011)
• Several authors have suggested selecting variables before PLS regression to improve results (Leardi 1998, Hoskuldsson 2001)
• Genetic algorithms have already been used successfully on IR data (Leardi 1998, Gomez-Carracedo 2007); a previous study on fatty acids gave good results (Ferrand, 2009)
• In genomic selection, penalization methods such as LASSO, Ridge Regression or Elastic Net are used (Croiseau, 2011)
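
The derivation pretreatment mentioned above can be illustrated with a minimal sketch. The slides do not say which derivative or smoothing filter was used, so the plain central finite difference below is an assumption, and the toy absorbance values and the 0.25 baseline offset are hypothetical. The point it shows: a constant baseline shift between two spectra disappears once the first derivative is taken.

```python
def first_derivative(absorbance, step=1.0):
    """Central finite-difference first derivative (interior points only).

    ASSUMPTION: a plain finite difference is used for illustration; the
    source does not specify the derivative filter actually applied.
    """
    return [
        (absorbance[i + 1] - absorbance[i - 1]) / (2.0 * step)
        for i in range(1, len(absorbance) - 1)
    ]


# Hypothetical toy spectrum (not data from the study).
spectrum = [0.40, 0.55, 0.80, 1.10, 0.90, 0.60, 0.45]

# Same spectrum with a constant baseline offset, e.g. an uncontrolled
# shift between two measurements.
shifted = [a + 0.25 for a in spectrum]

d_spectrum = first_derivative(spectrum)
d_shifted = first_derivative(shifted)

# Largest difference between the two derivative spectra: the offset cancels,
# so this is zero up to floating-point rounding.
max_gap = max(abs(a - b) for a, b in zip(d_spectrum, d_shifted))
print(max_gap)
```

In practice the derivative is usually combined with smoothing, because differencing amplifies noise; that refinement is left out here to keep the sketch short.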
Genetic algorithms method
• Optimization method based on evolutionary biology
• Principle: evolution of a population of solutions (=wavelength selection) using genetic operators like reproduction, mutation and selection
• Objective: obtain a population with the best solutions (=wavelength selection)
Penalization methods
Aim: to reduce the variance of the estimators in order to guarantee the stability of the estimates
• Ridge Regression (RR): all the predictors are kept
• LASSO: some coefficients are set to zero; in the presence of collinearity, only one predictor of the group is retained
• Elastic Net (EN): combination of RR and LASSO (two penalization parameters), more flexible
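
For reference, the three penalties can be written in their standard textbook form. The notation is an assumption, since the slides give no formulas: y is the vector of reference values, X the matrix of spectral predictors, β the coefficient vector, and λ, λ1, λ2 the non-negative penalization parameters.

```latex
\begin{aligned}
\text{RR:}    \quad & \hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2 \\
\text{LASSO:} \quad & \hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\text{EN:}    \quad & \hat{\beta} = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2
\end{aligned}
```

The squared L2 penalty shrinks all coefficients without setting any to zero, which is why RR keeps every predictor. The L1 penalty can set coefficients exactly to zero, which is why LASSO and EN act as wavelength selection methods.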
Samples analyzed
• 193 cow milk samples from Holstein, Normande and Montbéliarde cows, analyzed by MIR spectrometry and the reference method
• 153 ewe milk samples from Lacaune and Manech Tête Rousse ewes
• 153 goat milk samples from Saanen and Alpine goats
Cow milk: selected wavelengths
• 2272-1944 cm-1 band rarely selected
• 2970-2278 cm-1 and 2272-1944 cm-1 selected for most proteins
Number of retained wavelengths, by method:
Method    Retained wavelengths
GA        8 to 83
LASSO     4 to 29
EN        22 to 68
Conclusions
• First, to obtain robust equations, it seems fundamental to have a robust sample dataset with sufficient variability and accurate measurements by the reference method
• Gain in accuracy by reallocating the proteolysis products
• To implement these equations on a large scale, it is also essential to establish a harmonization system between laboratories (Leray et al., 2011)
Many thanks to all the partners of the project
Thank you for your attention!
Genetic algorithm: flow chart (adapted from Haupt (2004) and Leardi (1998))
1. INITIAL POPULATION: pool of solutions (30). N solutions are generated at random. Each solution is a vector over Var1, Var2, …, Var446: variable i takes the value 1 if selected, else 0.
2. EVALUATION of these solutions: R2CV is obtained by PLS regression on the selected variables.
3. REPRODUCTION: selection of 2 solutions (random selection). The better a solution is, the higher its probability of being chosen.
4. Possibility of CROSS-OVER (cross-over probability: 50%): combination of 2 solutions. Objective: to obtain 2 better solutions. Limit: the variability of solutions decreases.
5. Possibility of MUTATION (mutation probability: 1%): each variable has a mutation probability of x% (an unselected variable becomes selected and conversely). Objective: avoid having a pool of uniform solutions.
6. CREATION of a NEW POOL of SOLUTIONS: substitution of the 2 worst solutions by the new solutions, then return to the evaluation step.
7. STOP: when the quality of the solutions is constant, the algorithm is stopped. FINAL RESULT: the N best solutions.
Random steps in the chart: random generation, random selection, cross-over and mutation.
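
The loop in the flow chart can be sketched in a few lines of Python. This is an illustrative sketch, not Leardi's algorithm as used in the study. The pool size (30), cross-over probability (50%), mutation probability (1%), number of variables (446) and replacement of the 2 worst solutions come from the slides. Everything else is an assumption: `toy_fitness` is a hypothetical stand-in for the cross-validated R2 from PLS regression, parents are picked with rank-based weights, cross-over is single-point, and the loop runs for a fixed number of generations instead of stopping when solution quality is constant.

```python
import random

N_VARIABLES = 446      # Var1 ... Var446 in the flow chart
POOL_SIZE = 30         # pool of solutions
CROSSOVER_PROB = 0.5   # cross-over probability
MUTATION_PROB = 0.01   # mutation probability per variable

# ASSUMPTION: hypothetical "informative" wavelengths used by the toy fitness.
INFORMATIVE = set(range(100, 120))


def toy_fitness(solution):
    """Hypothetical stand-in for R2CV (the study uses PLS regression)."""
    hits = sum(solution[i] for i in INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.001 * sum(solution)


def pick_parent(pool, scores, rng):
    """Rank-based draw: the better a solution, the higher its probability."""
    order = sorted(range(len(pool)), key=lambda i: scores[i])
    weights = [0] * len(pool)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return pool[rng.choices(range(len(pool)), weights=weights, k=1)[0]]


def crossover(a, b, rng):
    """Single-point cross-over, applied with probability CROSSOVER_PROB."""
    if rng.random() >= CROSSOVER_PROB:
        return a[:], b[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(solution, rng):
    """Flip each variable (0 <-> 1) with probability MUTATION_PROB."""
    return [1 - bit if rng.random() < MUTATION_PROB else bit for bit in solution]


def run_ga(generations=200, seed=0, fitness=toy_fitness):
    rng = random.Random(seed)
    pool = [[rng.randint(0, 1) for _ in range(N_VARIABLES)]
            for _ in range(POOL_SIZE)]
    history = []  # best fitness at each generation
    for _ in range(generations):
        scores = [fitness(s) for s in pool]
        history.append(max(scores))
        parent_a = pick_parent(pool, scores, rng)
        parent_b = pick_parent(pool, scores, rng)
        children = [mutate(c, rng) for c in crossover(parent_a, parent_b, rng)]
        # Substitute the 2 worst solutions by the 2 new solutions.
        worst = sorted(range(POOL_SIZE), key=lambda i: scores[i])[:2]
        for idx, child in zip(worst, children):
            pool[idx] = child
    scores = [fitness(s) for s in pool]
    history.append(max(scores))
    best = pool[max(range(POOL_SIZE), key=lambda i: scores[i])]
    return best, history


best, history = run_ga()
print(sum(best), "variables selected; best toy fitness:", history[-1])
```

Because only the 2 worst solutions are replaced at each step, the best solution in the pool is never lost, so the best fitness cannot decrease from one generation to the next. To move closer to the slides, `fitness` could be swapped for a function returning the cross-validated R2 of a PLS model fitted on the selected wavelengths.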
Genetic algorithms use
• Use of the algorithm developed by Leardi
• Check the robustness by varying parameters (previous study)
• Fitness function: cross-validated explained variance
• Population size: 30 solutions
• Mutation probability: 1%
• Number of GA runs: 5 (to ensure an optimal