This looks like the Blinkist app's next evolution. Most of the summary tools you guys mentioned, like NotebookLM, SciRate etc., cover general, non-domain-specific content and rely on simple RAG/knowledge bases.
Academia definitely needs a tool that can parse complex equations, cross-links etc.
I was using NotebookLM today with FAA aviation charts loaded and the tool still hallucinates and does not parse visual data (maps, charts) well. I can imagine that in the world of arXiv papers, similarly complex charts and visualizations would not be processed properly.
Nice idea - would love to have some kind of daily mix with all the papers from my field / some way to prioritise them automatically based on the most important ones I should have read
What do you think would be a good way to prioritise papers? It seems to be especially difficult when the papers are not yet rated by the users. We were thinking of some algorithm that would analyse the authors of the paper and their previous track record, but it feels somewhat unfair towards the new/young academics?
Super curious about your thoughts.
hmm that’s true - I think if you could get a measure of interest for the paper like views on arxiv or number of mentions online, that could be a good metric to use.
it’s also possible that the LLM will be able to determine well which results are important vs which are minor improvements / changes that are not important for me to listen to.
The Economist does a "world in brief" news summary daily, and having that for the papers that are relevant to me would be great!
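To make the interest-based idea concrete, here is a minimal sketch of how such a ranking might look. Everything in it is an assumption for illustration: the `Paper` fields, the `mention_weight` value, and the premise that view and mention counts are available at all (the thread does not say where they would come from). It is not how ekoAcademic ranks papers.

```python
import math
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    views: int     # hypothetical: page views, however they are obtained
    mentions: int  # hypothetical: number of online mentions


def interest_score(paper: Paper, mention_weight: float = 3.0) -> float:
    """Combine views and mentions into a single score.

    log1p damps large counts, so one viral paper does not
    drown out everything else. The weight is an arbitrary choice.
    """
    return math.log1p(paper.views) + mention_weight * math.log1p(paper.mentions)


def daily_mix(papers: list[Paper], top_n: int = 5) -> list[Paper]:
    """Return the top_n papers, highest interest score first."""
    return sorted(papers, key=interest_score, reverse=True)[:top_n]


papers = [
    Paper("A", views=1000, mentions=0),
    Paper("B", views=100, mentions=10),
    Paper("C", views=10, mentions=0),
]
top = [p.title for p in daily_mix(papers, top_n=2)]
print(top)
```

With these made-up numbers, paper B ranks above A because its mentions outweigh A's larger view count. A score like this still says nothing useful about papers that have no engagement yet.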
Thanks. I think it is easy with older papers, since there is already a lot of engagement, but it can be more tricky with new submissions. How would you deal with new submissions? Myself, I am quite often looking through papers in the morning that were published the day before.
I like the idea of the "world in brief" - I will try to add it.
Not undermining this effort, but one could always use NotebookLM for that. Just paste an arXiv PDF link into NotebookLM and it generates a very good podcast that is also customizable through the prompt.
I tried to generate podcasts with NotebookLM, but it felt like too much show for my liking. I wanted to make something more factual and dry to make sure that the content is not lost in the cloud. Of course one can personalise it, but these extra few steps are annoying and were preventing me from using it on the bus on the way to my lab. Also, having it more as a community thing would allow us to add features that people are interested in. Thoughts?
Ah cool! Great to see accessibility stuff like this. Listening to papers makes it much easier for me to focus on the content.
I made my own little service that converts any webpage to (hopefully) the parsed content, then uses Google TTS and publishes it to a bucket and S3 feed, and I listen to them on my phone before bed.
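For anyone curious what the feed part of such a pipeline can look like, here is a rough sketch that builds a podcast-style RSS 2.0 feed with the Python standard library. It is not the commenter's actual service: the `build_feed` helper, the episode dictionary keys, and the example URLs are all hypothetical, and the parsing, TTS, and upload steps are left out entirely.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def build_feed(title: str, link: str, episodes: list[dict]) -> str:
    """Build an RSS 2.0 feed string from already-uploaded audio files.

    Each episode dict is assumed to have the keys
    'title', 'audio_url', 'size_bytes' and 'timestamp' (Unix seconds).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Articles converted to audio"

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # RSS expects RFC 822 style dates
        ET.SubElement(item, "pubDate").text = formatdate(ep["timestamp"], usegmt=True)
        # The enclosure element is what podcast apps use to find the audio
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )

    return ET.tostring(rss, encoding="unicode")


feed = build_feed(
    "My reading list",
    "https://example.com/feed",
    [
        {
            "title": "Some article",
            "audio_url": "https://example.com/audio/some-article.mp3",
            "size_bytes": 1234567,
            "timestamp": 0,
        }
    ],
)
print(feed)
```

The resulting XML can be written to a file next to the audio files, and most podcast apps can then subscribe to its URL.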
Awesome to see other people having the same need as us.
Do you see anything that we could add to the tool to make it more useful for you?
One thing we played around with, which works quite well, is directly interfacing it with GPT-realtime. This then allows one to talk to it about the paper. It also solves the language problem, since any person can talk to it in their own language, which could increase accessibility in science. I showed it to some Japanese colleagues the other day and they could interact with it in Japanese, which was quite amazing.
Sounds p cool
Integrate with scirate for that good good:
But seriously, I don't know of another place that centers academics up-voting papers without... well... actually citing them.
Oh, that is a nice idea! Do you think it would be more interesting for the community to reach out to SciRate and integrate the podcasts there, or would it be more interesting to try to scrape the scores from SciRate and integrate them into ekoAcademic?
The guy who runs scirate is a fellow quantum computing researcher (and all around friendly... fellow, lol). I can't speak for them, but I'm pretty sure they'd, at least, hear you out. The problem is more getting through the signal-to-noise of an academic's email:
https://kunalmarwaha.com/about
Thank you for the contact. I will send him an email.
Do you have any thoughts on @joshny's comment? We don't know exactly what the best strategy would be to deal with it.