[SIGCIS-Members] Perplexity, or ML tools in historical research

Wed May 7 11:32:26 PDT 2025

For those who are interested, my co-authors and I contributed a piece to
_Management and Organizational History_ that looks at some of the
methodological opportunities at the intersection of ML and history:

Villamor Martin, M., Kirsch, D. A., & Prieto-Nanez, F. (2023). The promise
of machine-learning-driven text analysis techniques for historical
research: topic modeling and word embedding. *Management & Organizational
History*. https://www.tandfonline.com/doi/abs/10.1080/17449359.2023.2181184

- david k.

On Wed, May 7, 2025 at 7:44 AM James Cortada via Members <
members at lists.sigcis.org> wrote:

> If I may add, I am personally being driven by two realities in my work.
>
> First, AI tools are still not as reliable a source of accurate
> information and of logic grounded in reality as all the hype would
> suggest.  That is my reality in 2025-2026.  If all that changes
> over time then I will revisit how to use AI, but not until then, just as in
> the 1970s we worried about the accuracy and relevance of children using
> handheld H-P calculators, etc.
>
> My second reality is that I ultimately and always am personally
> responsible for whatever I publish--not a publisher, not an editor, not
> another historian--but me.  It is my ethical and moral responsibility to
> offer up my works, my thoughts, my best efforts adhering to the highest
> standards of scholarship that I can muster.  So, no matter what tools I
> use, I as a human ultimately am responsible.  If I were to invent a
> footnote or inject some hallucinated nonsense into something I write,
> whether purposefully or unknowingly, shame on me, because these actions
> would be dishonest.  My second reality is increasingly of concern to me as
> I slowly evolve my research and writing to topics that are less narrowly
> monographic, moving toward more tacit knowledge topics where logic,
> facts, and boundaries with truth become increasingly fuzzy and perhaps less
> knowable and describable.  For such topics I have to eschew the use of
> AI for the foreseeable future and rely on my aging brain but absolutely
> current release of Microsoft Word to go about my work.
>
> I think the entire IEEE community is working through similar topics.  Read
> recent issues of *Computer*, *Edge*, and *Spectrum* (as I do as part of the
> benefits of my expensive IEEE membership) and you will see that Troy is
> right (so far) about it not being worth the cost of using such tools yet,
> but also, by implication, that we need to learn about this technology as it
> unfolds.  Most of the history journals and book publishers I work with are
> also just now beginning to learn even how to spell AI.  I consider our
> discussion of the subject as members of *Annals* to be ahead of those being
> held elsewhere, and so I find it most interesting, for which I thank all of
> you and Troy.
>
> Jim
>
> On Wed, May 7, 2025 at 3:46 AM Troy Astarte via Members <
> members at lists.sigcis.org> wrote:
>
>> Hi folks
>>
>> The piece Brian mentions having written is B. Randell and B. Coghlan,
>> "ChatGPT's Astonishing Fabrications About Percy Ludgate" in *IEEE Annals
>> of the History of Computing*, vol. 45, no. 02, pp. 71-72, April-June
>> 2023, doi: 10.1109/MAHC.2023.3272989. (it appears to be Open Access at the
>> Xplore link
>> https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10148832).
>>
>> Personally, my view is that no matter how good the answers these machine
>> learning systems give us, *they’re not worth the cost.* Quite apart from
>> the well-documented energy/water/environmental impact (which must surely be
>> our most important criterion when making any decision, these days), and the
>> social impacts (including educational, where ML has hit me personally the
>> hardest), from a scholarly perspective we have a duty to be critical of our
>> sources, and when they come from an LLM we simply can’t interrogate them
>> properly.
>>
>> As the editor of *Annals*, I need to follow the IEEE’s publication
>> guidelines on the use of machine learning systems in the production of
>> manuscripts: it is mandatory that authors declare if they have included ML-generated
>> text, but not if they have used a tool to check/correct their spelling or
>> grammar. Clearly this is a bit of a strange distinction: if an author
>> accepts a Grammarly-rewrite of a sentence, aren’t they including
>> ML-generated text?
>>
>> What I have not found *so far* is whether there is any requirement to
>> declare the use of ML-based tools in their research. Again, there may be
>> grey areas (I’m pretty sure Preview on my Mac is automatically OCR’ing text
>> in all my scanned PDFs; that’s probably ML-based but I can’t do much about
>> it), but it feels to me like there is a distinction between that and
>> asking a chatbot to do your research for you.
>>
>> As a community of scholars we will have to set our own standards here.
>> While my personal view is always going to be anti-ML, I will do my best to
>> listen openly and make sure the community is served as well as possible by
>> *Annals*.
>>
>> Best,
>>
>> Dr. Troy Kaighin Astarte (they/them / nhw)
>>
>> I often dictate messages due to motor disability; please forgive any
>> oddities resulting.
>>
>> Lecturer, Computer Science / Darlithydd, Cyfrifiadureg
>> Swansea University / Prifysgol Abertawe
>> Editor-in-Chief / Prif Olygydd, *IEEE Annals of the History of Computing*
>>
>>
>> Every email has a cost to the climate. Please think before sending short
>> emails.
>>
>> Yes, I have switched my default message font. Do you like it?
>>
>> On 6 May 2025, at 23:39, Ceruzzi, Paul via Members <
>> members at lists.sigcis.org> wrote:
>>
>> Thank you for bringing up this thread. Agree that there has been a great
>> advance in the past year, of which I was unaware.  Not as easy as it once
>> was to get the programs to hallucinate.
>>
>> I was chatting with a Google employee the other day. He works in the
>> Reston, Virginia office and was familiar with my book on Tysons Corner
>> ("Internet Alley"). I told him of my Substack posts on the concentration of
>> Data Centers in neighboring Ashburn. He suggested that I write a second
>> edition of the book to cover this topic. I told him that as a retiree I no
>> longer have the energy or stamina to write another book. He said
>> (paraphrasing): "Let Google Gemini write it for you. It will do an
>> excellent job. No problem, as long as you supervise it, and acknowledge how
>> the Second Edition was written."
>>
>> As Jack Benny once said, "I'm thinking it over."
>>
>> Paul Ceruzzi
>> Substack.com/@paulceruzzi <http://substack.com/@paulceruzzi>
>> ------------------------------
>> *From:* Members <members-bounces at lists.sigcis.org> on behalf of Adam
>> Hyland via Members <members at lists.sigcis.org>
>> *Sent:* Tuesday, May 6, 2025 6:15 PM
>> *To:* Brian Randell <brian.randell at newcastle.ac.uk>
>> *Cc:* Sigcis <members at sigcis.org>
>> *Subject:* Re: [SIGCIS-Members] Perplexity
>>
>> Thanks for your return to this topic. The Perplexity thread you shared is
>> private and unfortunately the PDF generated obscures the answer with a UI
>> element. I’ve asked the same question to Perplexity using their “research”
>> mode, which uses computing resources more profligately to give a more
>> detailed answer. The result is here:
>>
>> https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-oxneW37nS9.bAiYvfV3QVw
>>
>> It should be available to anyone with the URL.
>>
>> I’m familiar with Perplexity because the University of Washington pays
>> for “pro” access, which makes it both partially difficult and morally
>> dubious to proscribe its use in the classroom.
>>
>> One broad point is probably obvious to everyone but bears repeating:
>> there is much more difference in capability between any of these agents
>> today versus a year ago than there is among the agents themselves. They all
>> show a dramatic increase in capability to “understand” queries and generate
>> cogent responses. If someone on this thread last posed one of their exam
>> questions to ChatGPT a year ago, they ought to try again today, and again
>> in another month.
>>
>> -Adam
>>
>> Adam Hyland (*he/him*)
>> adampunk.com
>> UW HCDE PhD Student
>>
>>
>> On Tue, May 6, 2025 at 8:57 AM Brian Randell via Members <
>> members at lists.sigcis.org> wrote:
>>
>> Hi:
>>
>> A while ago I and colleagues published a short critique in the Annals of
>> the History of Computing of ChatGPT, based on its performance on some
>> questions about Percy Ludgate.
>>
>>
>> I’ve recently been trying (the free version of) Perplexity, the AI search
>> (or more exactly question-answering) system.
>>
>>
>> Perplexity itself claims:
>>
>>
>> Here's what makes Perplexity different
>>
>>
>> Answers that are accurate and always cited
>> We continuously search the internet and identify the best sources, from
>> academic research
>> to Reddit threads, to provide the perfect answer to any question.
>>
>>
>> Citations in every response
>> Every answer uses cited sources to provide a more accurate and
>> comprehensive answer.
>> If you want to dig deeper, just click the link to the source.
>>
>>
>> See its brilliant answer to the question “Did Percy Ludgate's work have
>> any impact?”:
>>
>>
>>
>> https://www.perplexity.ai/search/did-percy-ludgate-s-work-have-2esemV.BRuCo0Q7O_4AxfA
>>
>>
>> The web version limits the number of questions per day – so far the
>> iPhone App hasn’t.
>>
>> I assume I’m not alone here in trying Perplexity, but I don’t recall any
>> previous comment about it in SIGCIS.
>>
>> However, the Wikipedia article about it
>>   https://en.wikipedia.org/wiki/Perplexity_AI
>> is a little sobering, regarding its alleged copyright violations, and
>> failure to respect the robots.txt web-crawling standards.
>>
>>
>> Cheers
>>
>>
>> Brian
>> _______________________________________________
>> This email is relayed from members at sigcis.org, the email discussion
>> list of SHOT SIGCIS. Opinions expressed here are those of the member
>> posting and are not reviewed, edited, or endorsed by SIGCIS. The list
>> archives are at http://lists.sigcis.org/pipermail/members-sigcis.org/
>> and you can change your subscription options at
>> http://lists.sigcis.org/listinfo.cgi/members-sigcis.org
>>
>
>
> --
> James W. Cortada
> Senior Research Fellow
> Charles Babbage Institute
> University of Minnesota
> jcortada at umn.edu
> 608-274-6382
>