Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) in which words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, it is a mathematical embedding from a space with many dimensions (one per word) into a continuous vector space of much lower dimension.

Methods for generating this mapping include neural networks, dimensionality reduction on the word co-occurrence matrix, probabilistic models, the explainable knowledge base method, and explicit representation in terms of the context in which words appear.

Word and phrase embeddings, when used as the underlying input representation, have been shown to boost performance in NLP tasks such as syntactic parsing and sentiment analysis.

Development and history of the approach

In linguistics, word embeddings were discussed in the research area of distributional semantics. Its aim is to quantify and categorize semantic similarities between linguistic items based on their distributional properties in large samples of language data. The underlying idea that "a word is characterized by the company it keeps" was popularized by J. R. Firth.

The notion of a semantic space with lexical items (words or multi-word terms) represented as vectors or embeddings is based on the computational challenges of capturing distributional characteristics and using them in practical applications to measure similarity between words, phrases, or entire documents. The first generation of semantic space models is the vector space model for information retrieval. Such vector space models for words and their distributional data, implemented in their simplest form, yield a very sparse vector space of high dimensionality (cf. the curse of dimensionality). Reducing the number of dimensions with linear algebraic methods such as singular value decomposition then led to the introduction of latent semantic analysis in the late 1980s and to the random indexing approach for collecting word co-occurrence contexts. In 2000, Bengio et al. introduced, in a series of papers, "neural probabilistic language models" that reduce the high dimensionality of word representations in contexts by "learning a distributed representation for words".

Word embeddings come in two different styles: in one, words are expressed as vectors of co-occurring words; in the other, words are expressed as vectors of the linguistic contexts in which they occur. These styles are studied by Lavelli et al. (2004). Roweis and Saul published in Science how to use locally linear embedding (LLE) to discover representations of high-dimensional data structures. Since some foundational work by Yoshua Bengio and colleagues, most new word embedding techniques after about 2005 rely on a neural network architecture rather than on more probabilistic and algebraic models.

The approach was adopted by many research groups after advances around 2010 in theoretical work on the quality of the vectors and the training speed of the model, together with hardware advances that made it worthwhile to explore a broader parameter space. In 2013, a team at Google led by Tomas Mikolov created word2vec, a word embedding toolkit that can train vector space models faster than previous approaches. The word2vec approach has been widely used in experimentation. It was instrumental in raising interest in word embeddings as a technology, moving the research strand from specialised research into broader experimentation and eventually paving the way for practical application.

Limitations

One of the main limitations of word embeddings (and of word vector space models in general) is that words with multiple meanings are conflated into a single representation, that is, a single vector in the semantic space. In other words, polysemy and homonymy are not handled properly. For example, in the sentence "The club I tried yesterday was great!", it is not clear whether the term club refers to the word sense of a club sandwich, baseball club, clubhouse, golf club, or any other sense that the word club might have. The need to accommodate several senses per word in different vectors (multi-sense embeddings) has motivated several contributions in NLP that split single-sense embeddings into multi-sense ones.

Most approaches that produce multi-sense embeddings can be divided into two main categories according to their word sense representation: unsupervised and knowledge-based. Based on the word2vec skip-gram, Multi-Sense Skip-Gram (MSSG) performs word-sense discrimination and embedding simultaneously, improving its training time, while assuming a fixed number of senses for every word. In the Non-Parametric Multi-Sense Skip-Gram (NP-MSSG), this number can vary from word to word. Combining the prior knowledge of lexical databases (e.g. WordNet, ConceptNet, BabelNet), word embeddings, and word sense disambiguation, Most Suitable Sense Annotation (MSSA) labels word senses through an unsupervised and knowledge-based approach that considers a word's context in a pre-defined sliding window. Once the words are disambiguated, they can be used in a standard word embedding technique, which thus produces multi-sense embeddings. The MSSA architecture allows the disambiguation and annotation process to be performed recurrently in a self-improving manner.

The use of multi-sense embeddings is known to improve performance in several NLP tasks, such as part-of-speech tagging, semantic relation identification, and semantic relatedness. However, tasks involving named entity recognition and sentiment analysis do not seem to benefit from a multiple-vector representation.
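The sense-discrimination idea behind unsupervised multi-sense approaches can be illustrated with a small sketch: an occurrence of an ambiguous word is assigned to the sense whose centre is most similar (by cosine similarity) to the average of its context word vectors. This is a simplified toy under stated assumptions, not the actual MSSG or MSSA implementation. The 3-dimensional vectors are invented for the example, and the names `WORD_VECTORS`, `SENSE_CENTROIDS`, and `assign_sense` are hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for this illustration only.
WORD_VECTORS = {
    "sandwich": [0.9, 0.1, 0.0],
    "ate":      [0.8, 0.2, 0.1],
    "team":     [0.1, 0.9, 0.1],
    "played":   [0.0, 0.8, 0.2],
}

# Hypothetical context-cluster centres, one per sense of the word "club".
SENSE_CENTROIDS = {
    "club/food":  [1.0, 0.0, 0.0],
    "club/sport": [0.0, 1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def assign_sense(context_words):
    """Return the sense whose centre is most similar to the mean context vector."""
    known = [WORD_VECTORS[w] for w in context_words if w in WORD_VECTORS]
    if not known:
        raise ValueError("no context word has a known vector")
    context_vector = average(known)
    return max(
        SENSE_CENTROIDS,
        key=lambda sense: cosine(SENSE_CENTROIDS[sense], context_vector),
    )


if __name__ == "__main__":
    print(assign_sense(["ate", "sandwich"]))   # club/food
    print(assign_sense(["team", "played"]))    # club/sport
```

In a real system the sense centres would be learned during training rather than fixed by hand, and a non-parametric variant would be free to create a new sense when no existing centre is similar enough.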
For biological sequences: BioVectors

Word embeddings for n-grams in biological sequences (e.g. DNA, RNA, and proteins) for applications in bioinformatics were proposed by Asgari and Mofrad. The representation is named bio-vectors (BioVec) for biological sequences in general, protein-vectors (ProtVec) for proteins (amino-acid sequences), and gene-vectors (GeneVec) for gene sequences. It can be widely used in applications of deep learning in proteomics and genomics. The results presented by Asgari and Mofrad suggest that BioVectors can characterize biological sequences in terms of biochemical and biophysical interpretations of the underlying patterns.

Thought vectors

Thought vectors are an extension of word embeddings to entire sentences or even documents. Some researchers hope that they can improve the quality of machine translation.

Software

Software for training and using word embeddings includes Tomas Mikolov's word2vec, Stanford University's GloVe, GN-GloVe, AllenNLP's ELMo, BERT, fastText, Gensim, Indra, and Deeplearning4j. Both principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) are used to reduce the dimensionality of word vector spaces and to visualize word embeddings and clusters.

Examples of application

For instance, fastText is also used to compute word embeddings for the text corpora in Sketch Engine that are available online.

Notes

Mikolov, Tomas; Sutskever, Ilya; Chen, Kai; Corrado, Greg; Dean, Jeffrey (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546 [cs.CL].
Lebret, Remi; Collobert, Ronan (2013). Word Emdeddings through Hellinger PCA. Conference of the European Chapter of the Association for Computational Linguistics (EACL) 2014. arXiv:1312.5542. Bibcode:2013arXiv1312.5542L.
Levy, Omer; Goldberg, Yoav (2014). NIPS. Archived from the original (PDF) on 14 November 2016. Retrieved 20 October 2020.
Li, Yitan; Xu, Linli (2015). Int'l J. Conf. on Artificial Intelligence (IJCAI). Archived from the original (PDF) on 6 September 2015. Retrieved 20 October 2020.
Globerson, Amir (2007). Journal of Machine Learning Research. Archived from the original (PDF) on 21 September 2016.
Qureshi, M. Atif; Greene, Derek (4 June 2018). EVE: explainable vector based embedding technique using Wikipedia. Journal of Intelligent Information Systems 53: 137-165. arXiv:1702.06891. doi:10.1007/s10844-018-0511-x. ISSN 0925-9902. S2CID 10656055.
Levy, Omer; Goldberg, Yoav (2014). CoNLL, pp. 171-180. Archived from the original (PDF) on 25 September 2017. Retrieved 20 October 2020.
Socher, Richard; Bauer, John; Manning, Christopher; Ng, Andrew (2013). Proc. ACL Conf. Archived from the original (PDF) on 11 August 2016. Retrieved 20 October 2020.
Socher, Richard; Perelygin, Alex; Wu, Jean; Chuang, Jason; Manning, Chris; Ng, Andrew; Potts, Chris (2013). EMNLP. Archived from the original (PDF) on 28 December 2016. Retrieved 20 October 2020.
Firth, J. R. (1957). A synopsis of linguistic theory 1930-1955. Studies in Linguistic Analysis: 1-32. Reprinted in F. R. Palmer, ed. (1968). Selected Papers of J. R. Firth 1952-1959. London: Longman.
Salton, Gerard (1962). Proceeding AFIPS '62 (Fall): Proceedings of the December 4-6, 1962, fall joint computer conference: 234-250. Archived from the original on 18 October 2020. Retrieved 18 October 2020.
Salton, Gerard; Wong, A.; Yang, C. S. (1975). A Vector Space Model for Automatic Indexing. Communications of the Association for Computing Machinery (CACM): 613-620.
Dubin, David (2004). Archived from the original on 18 October 2020. Retrieved 18 October 2020.
Sahlgren, Magnus. Archived from the original on 21 June 2020. Retrieved 20 October 2020.
Kanerva, Pentti; Kristoferson, Jan; Holst, Anders (2000). Random Indexing of Text Samples for Latent Semantic Analysis. Proceedings of the 22nd Annual Conference of the Cognitive Science Society, p. 1036. Mahwah, New Jersey: Erlbaum.
Karlgren, Jussi; Sahlgren, Magnus (2001). Uesaka, Yoshinori; Kanerva, Pentti; Asoh, Hideki (eds.). From words to understanding. Foundations of Real-World Intelligence. CSLI Publications: 294-308.
Sahlgren, Magnus (2005). An Introduction to Random Indexing. Archived 21 October 2020 at the Wayback Machine. Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering, TKE 2005, August 16, Copenhagen, Denmark.
Sahlgren, Magnus; Holst, Anders; Kanerva, Pentti (2008). Permutations as a Means to Encode Order in Word Space. Archived 9 July 2009 at the Wayback Machine. In Proceedings of the 30th Annual Conference of the Cognitive Science Society: 1300-1305.
Ducharme, Rejean; Vincent, Pascal; Jauvin, Christian (2003). Journal of Machine Learning Research 3: 1137-1155. Archived from the original (PDF) on 29 October 2020. Retrieved 20 October 2020.
Bengio, Yoshua; Schwenk, Holger; Senecal, Jean-Sebastien; Morin, Frederic; Gauvain, Jean-Luc (2006). A Neural Probabilistic Language Model. Vol. 194, pp. 137-186. doi:10.1007/3-540-33486-6_6. ISBN 978-3-540-30609-2.
Lavelli, Alberto; Sebastiani, Fabrizio; Zanoli, Roberto (2004). Distributional term representations: an experimental comparison. 13th ACM International Conference on Information and Knowledge Management, pp. 615-624. doi:10.1145/1031171.1031284.
Roweis, Sam T.; Saul, Lawrence K. (2000). Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 290 (5500): 2323-6. Bibcode:2000Sci...290.2323R. CiteSeerX 10.1.1.111.3313. doi:10.1126/science.290.5500.2323. PMID 11125150.
Morin, Fredric; Bengio, Yoshua (2005). Hierarchical probabilistic neural network language model. AIstats 5: 246-252.
Mnih, Andriy; Hinton, Geoffrey (2009). Advances in Neural Information Processing Systems 21 (NIPS 2008). Curran Associates, Inc.: 1081-1088. Archived from the original on 3 September 2020. Retrieved 20 October 2020.
Archived from the original on 3 November 2020. Retrieved 20 October 2020.
Reisinger, Joseph; Mooney, Raymond J. (2010). Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Los Angeles, California: Association for Computational Linguistics, pp. 109-117. ISBN 978-1-932432-65-7. Archived from the original on 25 October 2019. Retrieved 25 October 2019.
Huang, Eric (2012). Improving word representations via global context and multiple word prototypes. OCLC 857900050.
Camacho-Collados, Jose; Pilehvar, Mohammad Taher (2018). From Word to Sense Embeddings: A Survey on Vector Representations of Meaning. arXiv:1805.04032. Bibcode:2018arXiv180504032C.
Neelakantan, Arvind; Shankar, Jeevan; Passos, Alexandre; McCallum, Andrew (2014). Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, PA, USA: Association for Computational Linguistics: 1059-1069. arXiv:1504.06654. doi:10.3115/v1/d14-1113. S2CID 15251438.
Ruas, Terry; Grosky, William; Aizawa, Akiko (1 December 2019). Multi-sense embeddings through a word sense disambiguation process. Expert Systems with Applications 136: 288-303. doi:10.1016/j.eswa.2019.06.026. hdl:2027.42/145475. ISSN 0957-4174.
Li, Jiwei; Jurafsky, Dan (2015). Do Multi-Sense Embeddings Improve Natural Language Understanding? Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics: 1722-1732. arXiv:1506.01070. doi:10.18653/v1/d15-1200. S2CID 6222768.
Asgari, Ehsaneddin; Mofrad, Mohammad R. K. (2015). Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics. PLOS ONE 10 (11): e0141287. arXiv:1503.05140. Bibcode:2015PLoSO..1041287A. doi:10.1371/journal.pone.0141287. PMC 4640716. PMID 26555596.
Kiros, Ryan; Zhu, Yukun; Salakhutdinov, Ruslan; Zemel, Richard S.; Torralba, Antonio; Urtasun, Raquel; Fidler, Sanja (2015). skip-thought vectors. arXiv:1506.06726 [cs.CL].
Archived from the original on 19 December 2016. Retrieved 20 October 2020.
Zhao, Jieyu. Learning Gender-Neutral Word Embeddings. arXiv:1809.01496 [cs.CL].
Archived from the original on 29 October 2020. Retrieved 20 October 2020.
Pires, Telmo; Schlinger, Eva; Garrette, Dan (4 June 2019). How multilingual is Multilingual BERT? arXiv:1906.01502 [cs.CL].
Archived from the original on 3 January 2017. Retrieved 20 October 2020.
25 October 2018. Archived from the original on 5 January 2021. Retrieved 20 October 2020.
Ghassemi, Mohammad; Mark, Roger; Nemati, Shamim (2015). Computing in Cardiology. Archived from the original (PDF) on 31 May 2016.
Embedding Viewer. Lexical Computing. Archived from the original on 8 February 2018. Retrieved 7 February 2018.