[SIGCIS-Members] Perplexity

James Cortada jcortada at umn.edu
Fri May 23 14:39:10 PDT 2025


To your second point, what if the old Irish newspapers were no longer
around to check?  We have ninety years of experience, at least in the USA,
of librarians filming and digitizing newspapers, magazines, and the like,
and then throwing away the originals.  That was especially true of American
librarians microfilming newspapers, beginning in the 1930s.  I hate to
criticize them, but they sometimes did not film every issue of a newspaper,
sometimes missed pages, and sometimes produced blurry images.  Now fast
forward to a future in which AI invents citations to material that no
longer exists in the original.  I encountered something similar with the
Google Books project a few years ago.  I worry about this possibility.  How
do we address that problem?

Meanwhile, I, an old guy who grew up pre-AI, will still look at hard copies
or trusted online versions of a source (e.g., a journal) and keep hard
copies of anything I cite off the Internet (e.g., images) before dropping
that information into an endnote, because I don't want to be accused of
faking anything.  It's hard enough working with real facts and sources.  I
suggest others follow a similar path.  Call me old-fashioned, but it has not
hurt my productivity so far.  Thanks for posting your thoughts, Brian.  Jim
Cortada

On Fri, May 23, 2025 at 4:22 PM Brian Randell via Members <
members at lists.sigcis.org> wrote:

> Hi:
>
>
>
> My colleague Brian Coghlan added two further points about the dangers AI
> hallucinations/fabrications pose to historians who are trying to use
> Generative AI tools to help their investigations:
>
>
>
> First, our own investigation very strongly implied that the "AI error
> rate" was inversely proportional to the AI's knowledge of the query topic.
> So one *very easy* countermeasure would be for AI tools to estimate their
> knowledge of the topic and adapt their response accordingly (be more
> cautious, not provide citations, warn users that their knowledge is weak,
> etc.). *THAT* would make me somewhat hopeful ...
>
> Secondly, "but then I found" fake citations -- these are a *REALLY*
> serious issue, especially for very old sources, e.g. the 100+ year-old
> Irish Times citations I had to check. It took a full two days of my time
> plus an archivist to check them, and he had spent the previous day
> physically retrieving the papers from a remote archive (and presumably
> another day returning them). I got this effort gratis because of the
> novelty of our investigation at the time, but it would be unlikely to be
> repeated now without funding to pay for staff effort. Imagine, some years
> from now, when huge numbers of these fake citations have to be checked,
> simply because one might be a previously unknown real citation. Who on
> earth is going to fund that?
>
>
>
> Cheers
>
>
>
> Brian Randell
>
>
>
> --
>
>
>
> School of Computing, Newcastle University, 1 Science Square, Newcastle
> upon Tyne, NE4 5TG
>
> EMAIL = Brian.Randell at ncl.ac.uk   PHONE = +44 (0)786 7805578
>
> URL =  https://www.ncl.ac.uk/computing/staff/profile/brianrandell.html
>
>
>
>
>
>
>
> *From: *Brian Randell <brian.randell at newcastle.ac.uk>
> *Date: *Wednesday, 7 May 2025 at 14:52
> *To: *Adam Hyland <achyland at uw.edu>
> *Cc: *Sigcis <members at sigcis.org>
> *Subject: *Re: [SIGCIS-Members] Perplexity
>
> Hi Adam:
>
>
>
> Thanks – I hadn’t realised that URL links to Perplexity’s answers aren’t
> transferable.
>
>
>
> It is certainly the case that Perplexity (which I believe is based on
> GPT-4) is much more “knowledgeable” about Percy Ludgate than ChatGPT was –
> not least because it has ingested subsequent additions to the Internet.
>
>
>
> Indeed, given the huge amount of effort and money being poured into LLM
> research we surely should expect the latest models to be significantly
> improved.
>
>
>
> And Perplexity’s answer to my question “Are the references that
> Perplexity provides always real ones?” was very reasonable, one might even
> say “thoughtful”, though its cautionary comments have of course to be
> applied to this answer itself – a nice example of recursion.
>
>
>
> However, the NY Times had a very interesting, and surely worrying (for the
> Artificial Intelligentsia, especially) piece yesterday entitled: “A.I. Is
> Getting More Powerful, but Its Hallucinations Are Getting Worse: A new
> wave of “reasoning” systems from companies like OpenAI is producing
> incorrect information more often. Even the companies don’t know why.”
>
>
>
> I hope this link to it will work:
>
>
> https://www.nytimes.com/2025/05/05/technology/ai-hallucinations-chatgpt-google.html?unlocked_article_code=1.E08.mKzL.qenddVF5UQVq&smid=url-share
>
>
>
> And on the subject of “hallucinations” (or rather “fabrications”), I also
> strongly recommend Gary Marcus’s very recent piece:
>
>
>
>   Why DO large language models hallucinate?
> <https://garymarcus.substack.com/p/why-do-large-language-models-hallucinate>
>
>
>
> Let me end by quoting from it:
>
>
>
> “Because LLMs statistically mimic the language people have used, they
> often *fool people* into thinking that they operate like people. But they
> don’t operate like people. They don’t, for example, ever *fact check* (as
> humans sometimes, when well motivated, do). They *mimic the kinds of
> things people say in various contexts*. And that’s essentially all
> they do. . .
>
> Of course the systems are probabilistic; not every LLM will produce a
> hallucination every time. But the problem is not going away; OpenAI’s
> recent o3 actually hallucinates *more*
> <https://www.theleftshift.com/openai-admits-newer-models-hallucinate-even-more/>
> than some of its predecessors.
>
> The chronic problem with creating fake citations in research papers
> <https://www.forbes.com/sites/larsdaniel/2025/01/29/the-irony-ai-experts-testimony--collapses-over-fake-ai-citations/>
> and faked cases in legal briefs
> <https://www.rawstory.com/lindell-dominion-2671843170/> is a
> manifestation of the same problem; LLMs correctly “model” the structure of
> academic references, but often make up titles, page numbers, journals and
> so on — once again failing to sanity-check their outputs against
> information (in this case, lists of references) that is readily found on
> the internet. So too is the rampant problem with numerical errors in
> financial reports
> <https://www.vals.ai/benchmarks/finance_agent-04-22-2025>, documented in
> a recent benchmark.
>
> Just how bad is it? One recent study showed rates of hallucination of
> between 15% and 60% <https://research.aimultiple.com/ai-hallucination/>
> across various models on a benchmark of 60 questions that were *easily
> verifiable relative to easily found CNN source articles that were directly
> supplied in the exam*. Even the best performance (a 15% hallucination
> rate) is, relative to an open-book exam with sources supplied, pathetic.
> That same study reports that, “According to Deloitte, 77% of businesses
> who joined the study are concerned about AI hallucinations”.
>
> If I can be blunt, it is an absolute embarrassment that a technology that
> has collectively cost about half a trillion dollars can’t do something as
> basic as (reliably) check its output against Wikipedia or a CNN article
> that is handed to it on a silver platter. But LLMs still cannot -- and on
> their own may never be able to -- reliably do even things that basic.”
>
> Cheers
>
>
>
> Brian
>
>
>
> --
>
>
>
> School of Computing, Newcastle University, 1 Science Square, Newcastle
> upon Tyne, NE4 5TG
>
> EMAIL = Brian.Randell at ncl.ac.uk   PHONE = +44 (0)786 7805578
>
> URL =  https://www.ncl.ac.uk/computing/staff/profile/brianrandell.html
>
>
>
>
>
>
>
> *From: *Adam Hyland <achyland at uw.edu>
> *Date: *Tuesday, 6 May 2025 at 23:15
> *To: *Brian Randell <brian.randell at newcastle.ac.uk>
> *Cc: *Sigcis <members at sigcis.org>
> *Subject: *Re: [SIGCIS-Members] Perplexity
>
>
> Thanks for your return to this topic. The Perplexity thread you shared is
> private, and unfortunately the generated PDF obscures the answer with a UI
> element. I’ve asked the same question of Perplexity using their “research”
> mode, which more profligately uses computing resources to give a more
> detailed answer. The result is here
> (https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-oxneW37nS9.bAiYvfV3QVw)
> and should be available to anyone with the URL.
>
>
>
> I’m familiar with Perplexity because the University of Washington pays
> for “pro” access, which makes it both somewhat difficult and morally
> dubious to proscribe its use in the classroom.
>
>
>
> One broad point is probably obvious to everyone but bears repeating: there
> is much more difference in capability between any of these agents today and
> the same agents a year ago than there is among the agents themselves. They
> all show a dramatic increase in capability to “understand” queries and
> generate cogent responses. If someone on this thread posed one of their
> exam questions to ChatGPT last year, they ought to try again today, and
> again in another month.
>
>
>
> -Adam
>
>
> Adam Hyland (*he/him*)
>
> adampunk.com
>
> UW HCDE PhD Student
>
>
>
>
>
> On Tue, May 6, 2025 at 8:57 AM Brian Randell via Members <
> members at lists.sigcis.org> wrote:
>
> Hi:
>
>
>
> A while ago my colleagues and I published in the Annals of the History of
> Computing a short critique of ChatGPT, based on its performance on some
> questions about Percy Ludgate.
>
>
>
> I’ve recently been trying (the free version of) Perplexity, the AI search
> (or more exactly question-answering) system.
>
>
>
> Perplexity itself claims:
>
>
>
> Here's what makes Perplexity different
>
>
>
> Answers that are accurate and always cited
>
> We continuously search the internet and identify the best sources, from
> academic research
>
> to Reddit threads, to provide the perfect answer to any question.
>
>
>
> Citations in every response
>
> Every answer uses cited sources to provide a more accurate and
> comprehensive answer.
>
> If you want to dig deeper, just click the link to the source.
>
>
>
> See its brilliant answer to the question “Did Percy Ludgate's work have
> any impact?”:
>
>
>
>
> https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-2esemV.BRuCo0Q7O_4AxfA
>
>
>
> The web version limits the number of questions per day – so far the iPhone
> app hasn’t.
>
>
>
> I assume I’m not alone here in trying Perplexity, but I don’t recall any
> previous comment about it in SIGCIS.
>
>
>
> However, the Wikipedia article about it
>
>   https://en.wikipedia.org/wiki/Perplexity_AI
>
> is a little sobering, regarding its alleged copyright violations and its
> failure to respect the robots.txt web-crawling standard.
>
>
>
> Cheers
>
>
>
> Brian
>
> _______________________________________________
> This email is relayed from members at sigcis.org, the email discussion
> list of SHOT SIGCIS. Opinions expressed here are those of the member
> posting and are not reviewed, edited, or endorsed by SIGCIS. The list
> archives are at
> http://lists.sigcis.org/pipermail/members-sigcis.org/
> and you can change your subscription options at
> http://lists.sigcis.org/listinfo.cgi/members-sigcis.org
>
> _______________________________________________
> This email is relayed from members at sigcis.org, the email discussion
> list of SHOT SIGCIS. Opinions expressed here are those of the member
> posting and are not reviewed, edited, or endorsed by SIGCIS. The list
> archives are at http://lists.sigcis.org/pipermail/members-sigcis.org/ and
> you can change your subscription options at
> http://lists.sigcis.org/listinfo.cgi/members-sigcis.org
>


-- 
James W. Cortada
Senior Research Fellow
Charles Babbage Institute
University of Minnesota
jcortada at umn.edu
608-274-6382