Hacker News | past | comments | ask | show | jobs | submit | Aiedail's comments

This is commendable, but there's room for improvement. Until now, SOTA-level "open-source" LLMs (LLaMA, Mistral, etc.) have usually made only their inference code and model architecture public. While these are not insignificant, they are somewhat trivial compared to the training code and training datasets, since those two factors largely determine the model's performance. That is not truly open at all. It goes without saying that sharing the training datasets and training process with other AI researchers is crucial. This transparency would not only help improve the model (since others could contribute to it) but also benefit the whole community, as these releases usually advertise. Otherwise, it will be difficult for such efforts to truly advance the development of LLMs.


Is there a similar tool for Python? I don't like the DAP approach and would prefer a gdb-like Python debugger :(
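For what it's worth, Python's standard library already ships a gdb-style command-line debugger, pdb, with familiar `s`/`n`/`b`/`p`/`c` commands. A minimal sketch, driving it non-interactively only so the session is reproducible (normally you'd just run `python -m pdb script.py`, or drop a `breakpoint()` call into your code, and type the same commands at the `(Pdb)` prompt):

```python
import io
import pdb

def divide(a, b):
    result = a / b
    return result

# Feed gdb-like commands via stdin so this example runs unattended:
# "s" steps into the next line, "p a" prints the argument, "c" continues.
commands = io.StringIO("s\np a\nc\n")
output = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=output)
debugger.runcall(divide, 6, 3)

print(output.getvalue())  # the captured (Pdb) session, including "p a" -> 6
```

In an interactive session you would skip the StringIO plumbing entirely; it is only here to make the example self-contained.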


Have you ever tried a bright environment with a dark-mode monitor? I've been working this way for about three years, and it keeps me both relaxed and comfortable.


Maybe it’s just that my monitors are too dark, but when I try this everything just looks too washed out and I can’t really tell what I’m looking at. Switching to light mode was more a side effect of the increased lighting than anything.


I'm curious about what makes this project special, since there are a lot of similar implementations of diffusion models based on PyTorch/TF. Is it that it uses the CPU itself to run the diffusion process?


Yeah. For something like this, you'd ideally want a powerful GPU with 12-24 GB of VRAM. If you have something like an RTX 2070 at the bare minimum, you probably don't need this and could do a lot more steps a lot faster on a GPU, but it's great for those who don't have that option.


A $500 RTX 3070 with 8GB of VRAM can generate 512x512 images with 50 steps in 7 seconds.


The RTX 2070 also shipped with 8GB of VRAM, just fyi.


Yep, 8GB works fine. The 2070 is where I started. I wouldn't consider it ideal, though: there will be cases where you'll wish you could increase the resolution a little more, or fit just a few more images per batch, but instead you get CUDA out-of-memory errors.
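The usual workaround when a batch doesn't fit is to halve the batch size on OOM and retry. A minimal pure-Python sketch of that pattern; `generate_batch` is a hypothetical stand-in for the actual sampler, and the VRAM numbers are purely illustrative, not measured:

```python
def generate_batch(batch_size, vram_budget_mb=8192, per_image_mb=3000):
    # Hypothetical stand-in for a diffusion sampler: pretends each image
    # costs per_image_mb of VRAM and raises the way torch does on CUDA OOM.
    # All numbers here are illustrative assumptions, not measurements.
    if batch_size * per_image_mb > vram_budget_mb:
        raise MemoryError("CUDA out of memory (simulated)")
    return [f"image_{i}" for i in range(batch_size)]

def generate_with_backoff(n_images, start_batch=8):
    # Halve the batch size on OOM and retry until all images are generated.
    images, batch = [], start_batch
    while len(images) < n_images:
        try:
            images += generate_batch(min(batch, n_images - len(images)))
        except MemoryError:
            if batch == 1:
                raise  # even a single image doesn't fit; give up
            batch //= 2
    return images

print(len(generate_with_backoff(5)))  # all 5 images, generated 2 at a time
```

With a real sampler you'd catch `torch.cuda.OutOfMemoryError` instead of `MemoryError`, but the backoff logic is the same.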

