The talk focuses for a bit on having pure data from before the given date. But it doesn't consider that the data available from before that time may be subject to strong selection bias, based on what's interesting to people doing scholarship or archival work after that date. E.g. have we disproportionately digitized the notes/letters/journals of figures whose ideas have gained traction after their death?
The article makes a comparison to financial backtesting. If you form a dataset of historical prices of stocks which are _currently_ in the S&P500, even if you only use price data before time t, models trained against your data will expect that prices go up and companies never die, because they've only seen the price history of successful firms.
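The survivorship bias described above is easy to demonstrate. The following is a minimal sketch (not from the article; all names and parameters are illustrative): simulate a universe of zero-drift random-walk prices, then keep only the "survivors" whose final price is above where they started, mimicking a dataset built from firms currently in the index.

```python
import random

random.seed(0)

def simulate_price(n_steps=250, start=100.0):
    # Zero-drift random walk: each day's return is +1% or -1% with equal odds.
    prices = [start]
    for _ in range(n_steps):
        prices.append(prices[-1] * (1 + random.choice([-0.01, 0.01])))
    return prices

universe = [simulate_price() for _ in range(2000)]

# "Survivors": series that ended above their starting price, i.e. the firms
# that would still be in the index today.
survivors = [p for p in universe if p[-1] > p[0]]

def mean_total_return(series_list):
    return sum(p[-1] / p[0] - 1 for p in series_list) / len(series_list)

print(f"full universe mean return:  {mean_total_return(universe):+.2%}")
print(f"survivors-only mean return: {mean_total_return(survivors):+.2%}")
```

Even though the true drift is zero by construction, the survivors-only average return comes out positive, which is exactly the "prices go up and companies never die" illusion a model trained on such data would learn.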
The talk explicitly addresses this exact issue.
It mentions that problem in the first section
Not a financial person by any means, but doesn't Black Swan theory basically disprove such methods? A rare event can have a huge impact without anything in the past predicting that it might happen, and even when the event itself can be predicted, the impact cannot.
For example: Chernobyl, COVID, 2008 financial crisis and even 9/11
Someone has sort of done this:
https://www.reddit.com/r/LocalLLaMA/comments/1mvnmjo/my_llm_...
I doubt a better one would cost $200,000,000.
I've been wanting to do this on historical court records - building upon the existing cases, one by one, using llms as the "Judge". It'd be interesting to see which cases branch off from the established precedent, and how that cascades into the present.
Any thoughts how one could get started with this?
I like the idea of using vintage LLMs to study explicit and implicit bias, e.g. text from before the mid-19th century that takes racial superiority, gender discrimination, imperial authority, or slavery for granted, compared with text from after. I'm sure there are more ideas once you apply temporal constraints to training data.
I was hoping that this would be about Llama 1 and comparison with GPT-contaminated models.
Over the long term LLMs are going to become very interesting snapshots of history. Imagine prompting an LLM from 2025 in 2125.
You're right: I wish OpenAI could find a way to "donate" GPT-2 or GPT-3 to the CHM, or some open archive.
I feel like that generation of models was around the point where we were getting pleasantly surprised by the behaviors of models. (I think people were having fun translating things into sonnets back then?)
I would probably prefer wikipedia snapshots (including debate) as a future historian.
Maybe in the sense that a CueCat is interesting to us today.
I love the ideas about how we might use historical LLMs to inquire into the past!
I imagine (the author hints at this) that to do this rigorously, spelling out assumptions and so on, you'd have to build on the theoretical frameworks currently used in history and the social sciences to inductively synthesize and qualify interviews and texts.
Very cool! I've been wanting to do this for a long time!