[SIGCIS-Members] Perplexity

Brian Randell brian.randell at newcastle.ac.uk
Wed May 7 06:52:35 PDT 2025


Hi Adam:

Thanks – I hadn’t realised that URL links to Perplexity’s answers aren’t transferable.

It is certainly the case that Perplexity (which I believe is based on GPT-4) is much more “knowledgeable” about Percy Ludgate than ChatGPT was – not least because it has ingested subsequent additions to the Internet.

Indeed, given the huge amount of effort and money being poured into LLM research, we should surely expect the latest models to be significantly improved.

And Perplexity’s answer to my question “Are the references that Perplexity provides always real ones?” was very reasonable, one might even say “thoughtful”, though its cautionary comments of course have to be applied to this answer itself – a nice example of recursion.

However, the NY Times had a very interesting, and surely worrying (for the Artificial Intelligentsia, especially) piece yesterday entitled: “A.I. Is Getting More Powerful, but Its Hallucinations Are Getting Worse: A new wave of “reasoning” systems from companies like OpenAI is producing incorrect information more often. Even the companies don’t know why.”

I hope this link to it will work:
https://www.nytimes.com/2025/05/05/technology/ai-hallucinations-chatgpt-google.html?unlocked_article_code=1.E08.mKzL.qenddVF5UQVq&smid=url-share

And on the subject of “hallucinations” (or rather “fabrications”), I also strongly recommend Gary Marcus’s very recent piece:

  Why DO large language models hallucinate? (https://garymarcus.substack.com/p/why-do-large-language-models-hallucinate)

Let me end by quoting from it:

“Because LLMs statistically mimic the language people have used, they often fool people into thinking that they operate like people. But they don’t operate like people. They don’t, for example, ever fact check (as humans sometimes, when well motivated, do). They mimic the kinds of things people say in various contexts. And that’s essentially all they do. . .
Of course the systems are probabilistic; not every LLM will produce a hallucination every time. But the problem is not going away; OpenAI’s recent o3 actually hallucinates more than some of its predecessors (https://www.theleftshift.com/openai-admits-newer-models-hallucinate-even-more/).
The chronic problem with creating fake citations in research papers (https://www.forbes.com/sites/larsdaniel/2025/01/29/the-irony-ai-experts-testimony--collapses-over-fake-ai-citations/) and faked cases in legal briefs (https://www.rawstory.com/lindell-dominion-2671843170/) is a manifestation of the same problem; LLMs correctly “model” the structure of academic references, but often make up titles, page numbers, journals and so on – once again failing to sanity check their outputs against information (in this case lists of references) that is readily found on the internet. So too is the rampant problem with numerical errors in financial reports (https://www.vals.ai/benchmarks/finance_agent-04-22-2025), documented in a recent benchmark.
Just how bad is it? One recent study showed rates of hallucination of between 15% and 60% (https://research.aimultiple.com/ai-hallucination/) across various models on a benchmark of 60 questions that were easily verifiable relative to easily found CNN source articles that were directly supplied in the exam. Even the best performance (a 15% hallucination rate) is, relative to an open-book exam with sources supplied, pathetic. That same study reports that, “According to Deloitte, 77% of businesses who joined the study are concerned about AI hallucinations”.
If I can be blunt, it is an absolute embarrassment that a technology that has collectively cost about half a trillion dollars can’t do something as basic as (reliably) check its output against Wikipedia or a CNN article that is handed on a silver platter. But LLMs still cannot – and on their own may never be able to – reliably do even things that basic.”
Cheers

Brian

--

School of Computing, Newcastle University, 1 Science Square, Newcastle upon Tyne, NE4 5TG
EMAIL = Brian.Randell at ncl.ac.uk   PHONE = +44 (0)786 7805578
URL =  https://www.ncl.ac.uk/computing/staff/profile/brianrandell.html



From: Adam Hyland <achyland at uw.edu>
Date: Tuesday, 6 May 2025 at 23:15
To: Brian Randell <brian.randell at newcastle.ac.uk>
Cc: Sigcis <members at sigcis.org>
Subject: Re: [SIGCIS-Members] Perplexity

Thanks for your return to this topic. The Perplexity thread you shared is private, and unfortunately the PDF it generated obscures the answer with a UI element. I’ve asked the same question to Perplexity using their “research” mode, which more profligately uses computing resources to give a more detailed answer. The result is here (https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-oxneW37nS9.bAiYvfV3QVw) and should be available to anyone with the URL.

I’m familiar with Perplexity because the University of Washington pays for “pro” access, which makes it both partially difficult and morally dubious to proscribe its use in the classroom.

One broad point is probably obvious to everyone but bears repeating: there is much more difference in capability between any one of these agents today and that same agent a year ago than there is among the agents themselves. They all show a dramatic increase in their capability to “understand” queries and generate cogent responses. If someone on this thread posed one of their exam questions to ChatGPT last year, they ought to try again today, and again in another month.

-Adam

Adam Hyland (he/him)
adampunk.com (https://adampunk.com/)
UW HCDE PhD Student


On Tue, May 6, 2025 at 8:57 AM Brian Randell via Members <members at lists.sigcis.org> wrote:
Hi:

A while ago I and colleagues published a short critique in the Annals of the History of Computing of ChatGPT, based on its performance on some questions about Percy Ludgate.

I’ve recently been trying (the free version of) Perplexity, the AI search (or more exactly question-answering) system.

Perplexity itself claims:

Here's what makes Perplexity different

Answers that are accurate and always cited
We continuously search the internet and identify the best sources, from academic research
to Reddit threads, to provide the perfect answer to any question.

Citations in every response
Every answer uses cited sources to provide a more accurate and comprehensive answer.
If you want to dig deeper, just click the link to the source.

See its brilliant answer to the question “Did Percy Ludgate's work have any impact?”:

https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-2esemV.BRuCo0Q7O_4AxfA

The web version limits the number of questions per day – so far the iPhone App hasn’t.

I assume I’m not alone here in trying Perplexity, but I don’t recall any previous comment about it in SIGCIS.

However, the Wikipedia article about it
  https://en.wikipedia.org/wiki/Perplexity_AI
is a little sobering, regarding its alleged copyright violations and its failure to respect the robots.txt web-crawling standard.

Cheers

Brian
_______________________________________________
This email is relayed from members at sigcis.org, the email discussion list of SHOT SIGCIS. Opinions expressed here are those of the member posting and are not reviewed, edited, or endorsed by SIGCIS. The list archives are at http://lists.sigcis.org/pipermail/members-sigcis.org/ and you can change your subscription options at http://lists.sigcis.org/listinfo.cgi/members-sigcis.org

