000 01736nam a2200313Ia 4500
003 MX-MdCICY
005 20251009160708.0
040 _cCICY
090 _aB-21897
245 1 0 _aSimulating 500 million years of evolution with a language model.
490 0 _aScience, 387(6736), 850-858, 2025.
500 _aArticle
520 3 _aMore than 3 billion years of evolution have produced an image of biology encoded into the space of natural proteins. Here, we show that language models trained at scale on evolutionary data can generate functional proteins that are far away from known proteins. We present ESM3, a frontier multimodal generative language model that reasons over the sequence, structure, and function of proteins. ESM3 can follow complex prompts combining its modalities and is highly responsive to alignment to improve its fidelity. We have prompted ESM3 to generate fluorescent proteins. Among the generations that we synthesized, we found a bright fluorescent protein at a far distance (58% sequence identity) from known fluorescent proteins, which we estimate is equivalent to simulating 500 million years of evolution.
650 1 4 _aCOMPUTER SIMULATION
650 1 4 _aEVOLUTION, MOLECULAR
650 1 4 _aLANGUAGE
650 1 4 _aLUMINESCENT PROTEINS
650 1 4 _aSEQUENCE ALIGNMENT
700 1 2 _aHayes, T.
700 1 2 _aRao, R.
700 1 2 _aAkin, H.
700 1 2 _aSofroniew, N. J.
700 1 2 _aOktay, D.
700 1 2 _aLin, Z.
700 1 2 _aRives, A.
856 4 0 _uhttps://drive.google.com/file/d/17VFIWMKQlXGL5mCBizj9KcLEmzSRA2uX/view?usp=drive_link
_zTo view the document, sign in to Google with your @cicy.edu.mx account
942 _2Loc
_cREF1
008 251009s9999 xx 000 0 und d
999 _c61986
_d61986