Interesting! That should simplify grant making. Though I think this project would be better published as a repository with a detailed description or configuration of the pipeline / graph / prompts, so that people can run it with their own API keys and customizations. Publishing it as a separate site limits its usage, imho, and also those projects are public now.
Michaël Trazzi did a similar project last July at Apart's research augmentation hackathon: https://apartresearch.com/project/grant-application-simulator
Might be worthwhile notifying Michaël about it (just in case he's not aware) so he can chime in or spread it to others interested in that direction.
You might want to be using o3 for this? Afaict "Deep Research" is just a wrapper around o3 which writes a giant chain of thought that's then summarized by another model into the fancy report you get.
(Here's a deep research report on deep research I made soon after DR came out: https://chatgpt.com/s/dr_685312b70aa48191964860333e0fee56) That's my source here; the primary source is the system card, I think.
Also uh be careful not to get goodharted, now that the people submitting essentially know every one of your criteria and know that an LLM is doing the first pass. I'm really not sure this is a good idea vs. finding and paying high-taste humans to do it.
Oh yeah, more on o3: as the DR report points out and as OAI has shown, o3 was heavily RLed in the "do internet searches" direction, which is why e.g. it's great at rainbolting (geoguessing).
This is definitely not the case for Perplexity (which only offers scaffolding or a fine-tuned R1) or Opus (which ime is terrible at conducting good search).