Can I ask for help/advice?
I really want a @Roam x @Mathematica tool for making explicit, data-filled knowledge graphs of new fields. What’s a good way to hack a pipeline until someone builds this?
Context:
I often want to understand new fields. To do this, I need to both build an idea graph and compile evidence for nodes in the graph.
How I learn new fields:
- Build an idea graph (Roam, Google Sheets)
- Capture key papers + databases for nodes on the graph (local folders, Google Drive)
- Extract information from key papers + databases (Google Sheets, export to pandas or other programming)
- Graph extracted information (Prism, Matplotlib, Google Sheets charts, Jupyter or Mathematica notebook)
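Until a dedicated tool exists, one stopgap for the steps above is to keep the idea graph, the source pointers, and the extracted data in one plain data structure. This is a minimal Python sketch, not a Roam or Mathematica integration. `Evidence`, `IdeaNode`, `add_node`, `attach`, and `rows_for` are hypothetical names made up for illustration, and the path and numbers in the usage lines are placeholders, not real data.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str    # path or URL of the primary source (placeholder in the demo)
    locator: str   # where in the source the data lives, e.g. "Fig. 2B"
    rows: list     # extracted data points, one dict per point
    note: str = "" # free-text context that a spreadsheet cell would lose


@dataclass
class IdeaNode:
    title: str
    depends_on: list = field(default_factory=list)  # titles of parent ideas
    evidence: list = field(default_factory=list)    # Evidence objects


graph = {}  # title -> IdeaNode


def add_node(title, depends_on=()):
    """Create an idea node and register it in the graph."""
    graph[title] = IdeaNode(title, list(depends_on))
    return graph[title]


def attach(title, evidence):
    """Link a piece of primary evidence to an existing idea node."""
    graph[title].evidence.append(evidence)


def rows_for(title):
    """Flatten every extracted row for a node, tagging each with its source."""
    return [
        {**row, "source": ev.source, "locator": ev.locator}
        for ev in graph[title].evidence
        for row in ev.rows
    ]


# Usage with placeholder values (hypothetical, not taken from any paper).
add_node("idea X")
add_node("idea Y", depends_on=["idea X"])
attach("idea Y", Evidence("papers/example.pdf", "Fig. 1",
                          [{"x": 1.0, "y": 2.0}], note="n is small"))
print(rows_for("idea Y"))
```

Because `rows_for` returns a list of dicts, the result can be handed straight to `pandas.DataFrame(...)` for manipulation and plotting, while each row still carries a pointer back to its primary source.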
Problem:
It’s hard to link the idea graph and the primary evidence in a way that:
- Renders the primary source easily accessible
- Allows for fluid data manipulation
- Allows for easy data visualization
- (Wishlist) propagates uncertainty in one concept to dependent nodes (make Roam an actual PGM, or make PGMs more interpretable and editable)
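As a rough illustration of the wishlist item, here is a toy Python sketch of pushing uncertainty from a concept down to the nodes that depend on it. It is not a real PGM: it assumes every parent is required, treats parents as independent, and so over-penalizes nodes that share an ancestor. The function name `propagate` and its two dictionary arguments are assumptions made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate(own_confidence, depends_on):
    """Return the effective confidence of every node.

    own_confidence: {node: belief that the node holds, given its parents hold}
    depends_on:     {node: [parent nodes it relies on]}
    Every node, including each parent, needs an entry in own_confidence.
    Simplification: effective = own * product of the parents' effective values.
    """
    deps = {node: depends_on.get(node, []) for node in own_confidence}
    effective = {}
    # static_order() yields parents before the nodes that depend on them.
    for node in TopologicalSorter(deps).static_order():
        p = own_confidence[node]
        for parent in deps[node]:
            p *= effective[parent]
        effective[node] = p
    return effective


# Usage with made-up beliefs: doubting "a" lowers everything downstream.
beliefs = {"a": 0.9, "b": 0.5}
parents = {"b": ["a"]}
print(propagate(beliefs, parents))
```

Lowering the number on one upstream node and rerunning shows which downstream ideas are most exposed, which is the interaction the wishlist item is after, even if the arithmetic is crude.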
There are a few reasons for this:
- PDFs are a terrible, noisy filter for the raw data (see below)
- Solutions exist for items 2 and 3 above (for example, Mathematica or Jupyter notebooks), but PDFs are such a terrible core source of information that it’s hard to add them to this stack
- Science is nuanced, and any time you put things in a spreadsheet you lose some important context. Papers capture some of this in an unstructured way.
My core issue with reason #3 is that you should then be able to add structure to your idea graph and incorporate the primary evidence into the updated graph, with notes on your uncertainties. To me, this recursive cycle is at the heart of the process. I think a more fluid tool would allow for many more layers of iteration here.
Appendix: Why I hate papers so much as a way to get scientific info
How science works:
- Collect data, put in spreadsheet (scientist)
- Make JPEG with spreadsheet (scientist)
- Put JPEG in PDF (scientist)
- Extract JPEG from PDF (you)
- Extract data points from JPEG (you)
- Put data points in spreadsheet (you)
The paper PDF is literally a noisy, compressed filter on the info you want
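For the "extract data points from JPEG" step, the core arithmetic is a two-point calibration per axis: click two ticks with known values, then interpolate every other pixel position. A minimal Python sketch, assuming straight, unrotated axes; `make_axis` is a hypothetical helper and the pixel positions in the usage lines are invented for illustration.

```python
import math


def make_axis(pixel_a, value_a, pixel_b, value_b, log=False):
    """Build a pixel -> data-value converter for one axis.

    Calibrated from two reference ticks: pixel_a shows value_a and
    pixel_b shows value_b. Set log=True for a log10-scaled axis.
    """
    if log:
        value_a, value_b = math.log10(value_a), math.log10(value_b)
    scale = (value_b - value_a) / (pixel_b - pixel_a)

    def to_value(pixel):
        v = value_a + (pixel - pixel_a) * scale
        return 10 ** v if log else v

    return to_value


# Usage with invented pixel positions. Image y pixels grow downward,
# which the calibration handles because the scale comes out negative.
x_of = make_axis(100, 0.0, 500, 10.0)  # x ticks: 0 at px 100, 10 at px 500
y_of = make_axis(400, 0.0, 50, 7.0)    # y ticks: 0 at px 400, 7 at px 50
print(x_of(300), y_of(225))
```

This only recovers what the rendered figure still contains: marker size, JPEG compression, and overlapping points all add error that the original spreadsheet never had.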