Show HN: DataJoy, an online Python and R editor for scientists (getdatajoy.com)
17 points by jpallen on July 7, 2015 | 8 comments


DataJoy is our second product; we're also the team behind ShareLaTeX (www.sharelatex.com). We realised that a lot of the technical problems we had solved to make ShareLaTeX possible could also apply to more general environments, like Python and R, which are also very common in academia. We're hoping to make these tools easier for scientists to use, both when getting started and when collaborating on their data analysis, data cleanup, statistics, etc. Let us know what you think!


ShareLaTeX is a fantastic product, thank you for making it. It's very much a Google Doc for LaTeX. I used it very extensively when I was in university. Given how many auxiliary files are generated/needed when compiling LaTeX, keeping those files in the cloud was very painless.

I think an interesting avenue to explore with that product would be more templates/easy-to-start documents/quick tutorials.

LaTeX is the standard in some parts of academia (math, CS, stats) and not others (my brother, a biochemistry student at a public research university with 30k+ students, says nobody has even heard of it there). I think the share of the market that is still using Word for academic papers (often because they do not know of LaTeX or think they don't have time to learn it) is very significant. ShareLaTeX is definitely a step towards making LaTeX more accessible for that segment, but there's a long road ahead too.


Hey, I just signed up and I'm a bit unclear on how the "reproducible" part works. If I download my files they don't contain e.g. a requirements.txt?


This is a very good point, and one we haven't worked out a good solution to yet (it's a notoriously hard problem). A requirements.txt file would certainly help, but then that's the sort of thing you'd probably need to maintain yourself? Fully reproducible environments that can be run offline are a stretch goal of ours for the future though.

The 'reproducible' part that we mentioned was meant to refer to the fact that it's easily reproducible on DataJoy. I.e. if you share your work with a collaborator, you're not going to have any headaches around getting the scripts working on their system, installing the right packages again, etc.
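
For anyone who does want to maintain a requirements.txt by hand in the meantime, here is a minimal sketch of pinning whatever is installed in the current environment. It is not a DataJoy feature; it assumes Python 3.8+ (for importlib.metadata), and the function names freeze_lines and write_requirements are made up for illustration.

```python
from importlib.metadata import distributions


def freeze_lines(dists=None):
    """Return sorted 'name==version' pins for the given distributions.

    If no distributions are passed in, use everything installed in the
    current environment.
    """
    if dists is None:
        dists = distributions()
    pins = set()
    for dist in dists:
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    # Case-insensitive sort so the file is stable between runs.
    return sorted(pins, key=str.lower)


def write_requirements(path="requirements.txt"):
    """Write the pins to a file, one per line, and return them."""
    lines = freeze_lines()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines
```

Calling write_requirements() next to your scripts gives a collaborator a file they can feed to pip install -r requirements.txt. It only records Python package versions, so it does not cover the interpreter version or system libraries.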


Are there any future plans to incorporate IPython notebooks?

If you could run IPython notebooks as a service, I would think you would get a significant number of users.


Nice idea. How does this compare to something like Cloud 9?

Minor interface note: I think the Run Line / Run Script buttons look a little out of place and too big.


Thanks. DataJoy is aimed at a totally different market from Cloud 9. We want to make it easy for scientists and people doing data analysis, statistics, etc., rather than people developing software packages. The difference is that this requires a much more interactive environment where the code you're running is constantly evolving and being iterated on (if you've ever generated a plot, you'll know how many attempts it takes to get it perfect). So we're aiming to work well for that use case.


You are definitely correct when it comes to non-software developers needing tools to do data analysis that allow for a lot more iteration and ad hoc exploration.

How do you see this competing with / differentiating itself from RStudio? The Python support?

In the statistics department I was recently at, R was the lingua franca and RStudio was how we wrote it. I have not found a better product yet for ad hoc exploration than RStudio but will certainly give this a go.



