Yet it is hard, for several reasons, to fathom what LLMs "think".
Details of the programming and training data of commercial ones like ChatGPT are proprietary.
And not even the programmers know exactly what is going on inside.
Linguists have, however, found clever ways to test LLMs' underlying knowledge, in effect tricking them with probing tests.
And indeed, LLMs seem to learn nested, hierarchical grammatical structures, even though they are exposed to only linear input, ie, strings of text.
They can handle novel words and grasp parts of speech.
Tell ChatGPT that "dax" is a verb meaning to eat a slice of pizza by folding it, and the system deploys it easily: "After a long day at work, I like to relax and dax on a slice of pizza while watching my favourite TV show."
(The imitative element can be seen in "dax on", which ChatGPT probably patterned on the likes of "chew on" or "munch on".)
What about the "poverty of the stimulus"?
After all, GPT-3 (the LLM underlying ChatGPT until the recent release of GPT-4) is estimated to be trained on about 1,000 times the data a human ten-year-old is exposed to.
That leaves open the possibility that children have an inborn tendency to grammar, making them far more proficient than any LLM.
In a forthcoming paper in Linguistic Inquiry, researchers claim to have trained an LLM on no more text than a human child is exposed to, finding that it can use even rare bits of grammar.
But other researchers have tried to train an LLM on a database of only child-directed language (that is, of transcripts of carers speaking to children).
Here LLMs fare far worse.
Perhaps the brain really is built for language, as Professor Chomsky says.
It is difficult to judge.
Both sides of the argument are marshalling LLMs to make their case.
The eponymous founder of his school of linguistics has offered only a brusque riposte.
For his theories to survive this challenge, his camp will have to put up a stronger defence.