Greg Eiden has an idea we thought might be interesting/terrifying/amusing…
What might ChatGPT or other such tools derive from an analysis of the Samizdata blog post archives? To focus this on something timely: ask it to do the analysis and propose policy ideas for the new Trump Administration, ideas consistent with the perspectives of the Samizdatista community. Or at least the broad consensus of that community about liberty and limited, less regulatory government.
There are other ways this might be posed to an AI, e.g., “which proposed policies of the Trump administration are consistent with the broad governance perspectives at Samizdata.net and why? Which are not and why?” How would you set up such an analysis to give the model the best chance of producing something useful?
I am not interested in an experiment that assesses whether ChatGPT is a good model for this, or a good model in general; I want a good analysis! If the product of such an inquiry was not compelling to enough samizdatistas, if it did not pass a laugh test, we stop there.
Maybe there is a better choice of large language model than ChatGPT; maybe ChatGPT is not optimized for this sort of analysis but other models are? If you have some knowledge here, please chime in!
Perhaps ollama because you aren’t feeding into the OpenAI machine (you can run ollama locally). Load the documents with retrieval augmented generation and it should be good.
FWIW, this is a great tutorial on how to do it.
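To make Fraser’s suggestion a bit more concrete: whatever local stack you use (ollama or otherwise), most RAG pipelines start by splitting the source posts into overlapping chunks that get embedded and stored in a vector index. A minimal sketch of that chunking step is below; the chunk size and overlap are arbitrary assumptions, not tuned values.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split a post into overlapping character chunks for a RAG index.

    The overlap keeps sentences that straddle a chunk boundary
    retrievable from either side. 500/50 are arbitrary choices;
    tune them for whatever embedding model you end up using.
    """
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text) or 1, step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# Each chunk would then be embedded, stored in a vector index, and
# the best-matching chunks prepended to the prompt at question time.
```

At query time the retrieved chunks are what “augments” the model’s base knowledge, which is the layering Fraser describes further down the thread.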
Everything depends on the training data.
Do you include comments? In which case there are a couple of people who are going to be well over-represented in their views. Not all of us share some commenters’ obsession with “fiat money”, but by weight of comment it would be the #1 issue.
Do you include links? In which case links to concepts you intended to ridicule will be treated identically to those with which you agree.
In the end, what would AI add? It only parrots the training data anyway. It can’t think. It won’t come up with anything interesting.
Thanks, Fraser. Re “Load the documents…”, I don’t have the documents. I was hoping this could be done automatically, but I guess that implies I use a program that would scour Samizdata’s archives for all posts and retrieve them in a format that enables automated feeding into the LLM. I don’t have the skill to do that, even if it’s allowed, ethical, etc. I suppose I could pick 10-20 posts and load those? Thanks again for the interest.
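For anyone who does want to attempt the retrieval step Greg describes, a rough sketch of extracting post text from saved archive pages using only the Python standard library is below. The `class="post"` marker is a guess for illustration, not Samizdata’s real markup; you would need to inspect the actual pages and adjust, and for live fetching you would pair this with `urllib.request` (after checking robots.txt and getting permission, per Greg’s “allowed, ethical” caveat).

```python
from html.parser import HTMLParser

class PostExtractor(HTMLParser):
    """Collect the text inside <div class="post"> blocks.

    The class name is hypothetical; real archive markup will differ.
    """
    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside a post div
        self.posts = []   # extracted post texts
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            if self.depth:
                self.depth += 1          # nested div inside a post
            elif ("class", "post") in attrs:
                self.depth = 1           # entering a post block

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:          # leaving the post block
                self.posts.append("".join(self._buf).strip())
                self._buf = []

    def handle_data(self, data):
        if self.depth:                   # only keep text inside posts
            self._buf.append(data)

page = '<div class="post">First post.</div><p>nav</p><div class="post">Second.</div>'
p = PostExtractor()
p.feed(page)
# p.posts now holds the post texts, ready to write out for an LLM
```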
@Greg, Maybe Perry will be curious enough to give you a dump from his database.
Fraser Orr, this might be a good time for people who really don’t get AI yet – well, me – to ask a question.
Would you expect that this would result in a detailed summary of the average Samizdata view on everything? Or does it get (much) more sophisticated than that?
@bobby b
Would you expect that this would result in a detailed summary of the average Samizdata view on everything? Or does it get (much) more sophisticated than that?
Maybe? I think it might produce something like that. However, it is important to remember these systems work in layers. There is the core training data, which is what gives the model its base capabilities. That core training is essentially the part of the model that “understands” human language and is primed with a very large amount of general knowledge. You then layer on top of that a set of custom data (called RAG data) that augments that basic knowledge. Think of the core training as your learning through high school, and the RAG as your learning in college. Finally there is the context. This is the data the chat engine accumulates from your recent chat. If you have used ChatGPT, when you start a new chat it creates a tab for you. The context is what is in that conversation, and it is independent of everything else. These contexts are fairly short, I think usually about 4,000 tokens (where a token is roughly akin to a syllable in a word). So the context is more akin to the discussion you were having with your friends in the bar.
So if Greg follows my suggestion he’d put Samizdata in as RAG. To further my inexact analogy, for sure what you learn in college can change your perspective, but it is hard to shake off those foundational things drilled into you from your early years and school experience.
I don’t think the “birth through high school”/college/”bar chat” thing is a perfect analogy, but is maybe good enough as a general guide. And whether you think “understand” is a justifiable description of what it is doing is largely a matter of what you think the word “understand” means. So YMMV.
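Fraser’s 4,000-token figure can be made concrete with a sketch of how a chat client might keep the conversation inside the window: estimate each message’s token count and drop the oldest messages when the budget is exceeded. The 4-characters-per-token estimate is a crude English-text rule of thumb, not how real tokenizers work, and the budget is Fraser’s figure, not any particular model’s actual limit.

```python
def trim_context(messages, budget=4000):
    """Keep the most recent messages whose estimated token
    count fits inside the context window.

    Token estimate: ~4 characters per token, a rough rule of
    thumb; a real tokenizer (e.g. tiktoken) would differ.
    """
    est = lambda m: max(1, len(m) // 4)
    kept, used = [], 0
    for msg in reversed(messages):    # walk from newest to oldest
        cost = est(msg)
        if used + cost > budget:
            break                     # oldest messages fall off
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order
```

This is why the context behaves like a bar conversation: once the window fills up, the earliest exchanges are simply no longer visible to the model.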
However, I think Greg’s experiment sounds really interesting. I’d love to know the results.
Chester Draws @11:04pm, yes, I’d thought about including comments and/or links but thought they should be excluded for exactly the reasons you cite.
Fraser Orr @11:27pm, good suggestion! Perry? Seriously, Perry, maybe we can get away with a partial dump of data? If we stick with my original premise, to have ChatGPT come up with policy suggestions for Trump that are broadly consistent with the Samizdata community, then that might suggest ways to filter what we pull from the database (doesn’t everyone filter the data they feed to an AI to ensure they get the outcome they want!? 🙂 ). I also appreciate that you can probably just write down a top 10 list of ideas for Mr. Trump without an AI or even consulting your own database!
While I know I suggested this, I’m starting to think the ChatGPT answer might just elicit a collective “Duh!” from the Samizdatistas! Then again, a good rule in science is that if you think you know the outcome of an experiment that has not been done, but it concerns important issues, you should probably do the experiment! As long as it’s not too expensive. And while I was typing this Fraser Orr said he thinks this might be “interesting” (thanks!) so maybe the “Duh” is not a foregone conclusion!
I find it really surprising what these AI models produce. They are producing data that nobody told them through some sort of “reasoning” that I think we don’t really understand fully. And FWIW I don’t think they have nearly as much bias as some people have claimed. For example, ask ChatGPT what Donald Trump’s top policy priorities are and you’ll get a pretty decent answer. My favorite recently was when I asked it “Explain the physics used in the Road Runner cartoon series, and how it is different than the standard model of physics used in the real world.” And it gave a remarkably good answer. Or “in what way is Lord of the Rings similar to Bugs Bunny.” Again a remarkably “insightful” answer. How it could possibly know enough to answer these questions is entirely beyond me.
I’m travelling today so not much I can do.
Fraser,
Yes, it is quite remarkable at times. I was deeply skeptical until I used it. I was amazed at the images things like DALL-E 3 can produce. It does have quirks though. From my experience with text (mainly Quora’s Poe) I have found that what it excels at is giving a precis of a complicated issue. It is very useful to ask about computer configurations etc. I haven’t asked any questions that weren’t specific and essentially factual in nature (as opposed to asking an opinion).
As someone who knows enough about AI to know that I don’t know enough about AI, would it be possible to get around Chester Draws’ twin problems of over-represented views and links that we disagree with by asking ChatGPT (or whatever you are using) a question phrased as “What policy ideas would (INSERT HISTORICAL FIGURE BROADLY ALIGNED WITH SAMIZDATISTAS HERE) derive from analysis of samizdata.net?”
Historical Figures could include Ludwig von Mises, Milton Friedman, Adam Smith, Margaret Thatcher etc.
As an interesting counterpoint, could you try the same test, but with the likes of Marx, Lenin, Hitler?
I sense a Hitler rant incoming!
Gustave,
I suppose we could easily ask AI.
I asked Poe what C S Lewis might think about The Harry Potter books… Here’s the response:
C.S. Lewis, known for his own fantasy works like “The Chronicles of Narnia,” would likely have a nuanced view of the Harry Potter series. Here are some points he might consider:
Imagination and Fantasy: Lewis valued the power of imagination in literature. He might appreciate J.K. Rowling’s ability to create a richly detailed world that encourages readers to explore complex themes through fantasy.
Moral Themes: Lewis often infused his writing with moral and theological lessons. He might recognize the ethical dilemmas faced by characters in Harry Potter, such as friendship, sacrifice, and the battle between good and evil.
Christian Allegory: While Lewis’s own works have clear Christian allegorical elements, he might critique or analyze the absence of overt religious themes in Harry Potter. He could see the series as a different approach to exploring moral truths without explicit religious context.
Cultural Impact: Lewis might reflect on the cultural phenomenon of the Harry Potter books and their impact on children’s literature, considering their role in encouraging reading and imagination among young audiences.
Criticism of Magic: Given his views on the use of magic in fiction, Lewis may have reservations about the portrayal of magic in Harry Potter. He often warned against the dangers of misusing magical themes, though he might also see it as a tool for storytelling rather than a literal endorsement.
Fair warning to Perry and all, if Perry is willing to send me some data (give access to some/all of his database), I’m not sure when I’ll have a result for you. I will first need to figure out the mechanics of running an AI model, practice a bit (maybe?) to make sure I’ve got things optimized (I assume that’s a thing in running this model; it certainly is with all other models!), and then run it! I’m still working (a bit), travelling, etc. so this will take me at least weeks, not days. I know things tend to move faster around here so sorry, in advance, for that delay.
and to Gustave @12:46pm, it’s spelled “Hiteler”! “Heil me, baby!” (h/t Taika Waititi!)
I have not used Chat GPT – but many people who have used it tell me it has a built-in leftist bias.
So if you are going to use “Artificial Intelligence” (although I do not believe these things are intelligences – as they do not have free will – moral responsibility – personhood) do not use Chat GPT.
It would be like using the Google search engine (which also has a leftist bias), or trusting Amazon to make a faithful adaptation of the works of Tolkien (or any other cultural conservative).
@WindyPants
As someone who knows enough about AI to know that I don’t know enough about AI, would it be possible to get around Chester Draws’ twin problems of over-represented views
Well you can do that directly with prompt engineering. “What does the Samizdata community think of Trump’s immigration policy, but please ignore that idiot Fraser Orr when considering the answer.”
As to historical figures, I mean you can do that right now without any supplemental data.
“What would Milton Friedman have thought of Trump’s immigration policy”
Milton Friedman, a staunch advocate of free-market economics and limited government, might have had mixed reactions to Donald Trump’s immigration policies, depending on which aspects are considered.
Friedman often argued that immigration, especially of low-skilled workers, is beneficial in a free-market system because it increases labor supply and economic efficiency…
etc.
Paul Marks @2:20pm, “…Chat GPT…has a built-in leftist bias.”
Well then, forcing it to read the Samizdata archives will bring me satisfaction if it is in fact intelligent! 🙂
TBH I think the place to watch is not really ChatGPT; I think it is losing ground fast. The interesting areas are xAI (Musk’s company, which currently has the largest computer cluster in the world and is expanding it even further) and a growing amount of experimentation with models at home and in focused sectors. Also there are a LOT of really interesting, more application-specific AIs coming out that aren’t all LLMs but use the same underlying technology.
BTW, with regards to Musk creating a cluster with one million NVIDIA nodes, this is quite insane, and something only he could do. His current cluster is 100k nodes, and the amazing thing is that they built it in about half a year. This is the sort of engineering miracle that has become almost routine under Musk. It draws so much power that they run the electricity through Tesla battery packs to smooth it out, because tiny variations in the electrical grid can damage the cluster. As to the one million node cluster he is building: by my back-of-the-envelope calculations that single cluster may well draw more electricity than the whole city of Memphis that it sits next door to.
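Fraser’s back-of-the-envelope can be reproduced. The inputs below are assumptions for illustration, not measurements: an NVIDIA H100-class accelerator is rated around 700 W, and a 1.4 multiplier stands in for cooling, networking, and host overhead.

```python
# Back-of-the-envelope cluster power draw. All inputs are
# assumed figures, not measurements.
gpus = 1_000_000       # planned cluster size from the comment above
watts_per_gpu = 700    # approximate H100-class board power
overhead = 1.4         # assumed factor for cooling, networking, CPUs

total_watts = gpus * watts_per_gpu * overhead
total_gw = total_watts / 1e9
print(f"~{total_gw:.2f} GW")  # on the order of a gigawatt
```

A gigawatt of continuous draw is roughly the output of one large power-plant unit, which is why the comparison to a mid-sized city’s consumption is plausible.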
Musk got a late start, but I think he is the guy to watch (isn’t he always?). And I feel a lot better with him in charge than Sam Altman, that is for sure.
Fraser Orr @4:31pm, “[Musk]…he is the guy to watch…”
And what if he and Vivek can de-regulate the nuclear industry such that the rate-limiting step in building a nuclear power plant is NOT the government approvals or oversight!
How fast could Musk build the GWs he needs for his AI? I’ll bet he’d beat the ChiComm records! One of many areas where Musk’s job in DOGE is a clear conflict of interest with his corporate ambitions…I wonder if the gits in DC will use that angle to go after him?
However long it will take under current regimes to get the nukes we need…turn Musk loose and we’ll have them in half the time!
Will he build safe nukes? OF COURSE! How many of his exploding rockets have harmed anyone? ZERO.
This is what ChatGPT came up with based on the prompt in the original post:
Awesome, Alex! Thank you!
I guess I don’t have to learn how to run a ChatGPT query after all…although I still might. On my bucket list.
A key question: what parts of Samizdata.net did ChatGPT have access to in formulating these statements?
To all here, what do you think of the result? Did the bots nail your views? What’s missing?
I’m going to analyze this more when I have more time, e.g., make a score card on which of these Trump has already expressed some sort of support for, those he’s opposed to, etc.
And, Alex, if you don’t mind, I’ll probably email it to Trump, Musk and Vivek. They probably have an AI reading their mail! Ain’t that fun to think about!?
It’s easy. I find having an account gets you better answers (I wasn’t able to get it to answer the original prompt while logged out, but when I logged in it answered fine; I think it’s because I put “samizdata.net” in the prompt and the anonymous version cannot use the internet to answer queries).
https://openai.com/chatgpt
Unfortunately it’s rather difficult to know. OpenAI don’t publish the corpus of text used to train GPT. Fraser’s suggestions about training offline LLMs with material from Samizdata might produce better output than ChatGPT can produce. In the meantime, I’ll leave you with ChatGPT’s own answer about how much of Samizdata is in its training data:
Samizdata has been quoted in various papers, I’ve definitely seen references to it. I’m sure there’s a fair amount in the corpus, especially as the blog is publicly available.
I regard the output of GPT as public domain – it has been generated by AI from millions of tokens derived from other people’s work, I can’t see how it can be considered copyrightable material. I’m sure legally speaking the lawyers are already arguing about that, but certainly I don’t see the output from GPT as something I have any rights over. So please do feel free to email it to anyone – who knows, it may spark some ideas for someone.
I think it is worth pointing out that for the most part the description Alex posted above from ChatGPT is reasonably fair and lacking anti-right bias. For example:
Rollback on Environmental Overreach: Limit federal agencies like the EPA from imposing heavy-handed environmental rules that hurt businesses, emphasizing state and local decision-making on such matters.
This could easily have omitted “heavy-handed” or put in something about choosing economic activity at the expense of environmental protection, or said something about “state and local” meaning that we’d see a patchwork of different environmental regs, or said something about “undermining the changes made to slow down climate change — something that 97% of climate scientists agree is a critical problem”.
So I think this idea, which Paul mentions above, that these engines have a strong leftie or woke bias, isn’t really true, or at least isn’t consistently true.
Alex,
That ChatGPT answer was very USA focussed.
Yes, because that’s what the OP suggests:
NickM beat me to the punch – it is a very US centric list of suggestions. However, let’s be honest, if anyone stood on that ticket, we would be screaming their praises!
Yes, and with a bit of tweaking that could easily be localized to… anywhere.
GregWA – ChatGPT is not an intelligence, not a free will, morally responsible, being, it has no soul (in the religious or non religious, Aristotelian, sense).
Someone mentioned immigration.
Odd that there was no mass immigration from Latin America before public services and government (taxpayer financed) benefits – in fact not odd at all, as this is what the mass immigration is about (that and plundering by vast criminal gangs, which have thousands of armed members).
Milton Friedman – his “Negative Income Tax”, the return of the Speenhamland system by another name, is exactly the sort of thing that would attract the worst people from all over the planet. Government paying able bodied people of working age money (for no work) is just about the worst welfare program imaginable – unless one is in the business of selling booze or drugs, or running a bawdy house or a gambling den, especially if one is talking about an “Open Borders” policy.
However, the late Professor Friedman was best known for his work on monetary economics – for which he got a “Nobel Prize” in 1976 – his work appears to be a repeat of the mistaken work of Irving Fisher (Yale) in the 1920s, the idea that increasing the money supply is fine if the “price level” remains “stable” according to some “price index” or other.
This work was discredited by the insane antics of Benjamin Strong, head of the New York Federal Reserve, in the late 1920s – the vast Credit Money bubble, defended by “the price level is not going up – prices are stable” (a classic example of missing-the-point).
Milton Friedman openly admired (the dreadful) Benjamin Strong – which is baffling, utterly baffling.
“You do not understand Paul – Milton Friedman needs to be compared to J.M. Keynes who was vastly worse”.
I agree – but this is like saying that a person who shoots you once is better than someone who shoots you twice; I would rather not get shot at all.
People with memories that go back a few years will remember that paying able bodied people of working age money (for no work) was exactly what the Welfare Reform movement, which started with Governor Thompson in Wisconsin, was AGAINST – as it was horribly clear that this practice led to the creation of an underclass (expanding over the generations). Nor did it provide healthcare or even food for the poor, or the children of the poor – as the unearned money was spent on other things. Why anyone should have been in support of a return to the Speenhamland system (which started in the parish of Speenhamland in the late 1790s and was abolished in 1834) is hard to understand.
As for fiat money – whim money, money whose only value is the will-of-the-state enforced by Professor Krugman’s “men with guns” (his own term) – anyone who thinks that is a good idea is rather mistaken. As 2025 will show – rather graphically.
Money needs to be a Store of Value – not just a Medium of Exchange. And to be a good Store of Value money needs to be a commodity that is valued before-and-apart-from its use as money (historically gold or silver have served this role – but there may be other alternatives).