Hacker News | dpaleka's comments

More search won't do much good, but why wouldn't targeted training help? The way I see it, the adversarial policy search discovers positions that are off-distribution for anything seen in the victim's self-play training.

But training on that particular sort of adversarial state should help against a human player who has learned the strategy, just as training on patch adversarial examples in vision helps against the same type of patch.

Of course, if the adversarial policy is again allowed to find off-distribution states (by playing against the victim), it will certainly find ways to beat it, until the model plays perfectly. (Emergent gradient obfuscation could also happen in theory, but I don't know whether it has been demonstrated in practice.)
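To make the vision analogy concrete, here is a minimal sketch of one adversarial-training step in the FGSM style, on a toy logistic-regression model. Everything here (the model, data, and hyperparameters) is illustrative, not the setup from the Go paper:

```python
import numpy as np

def fgsm_adv_train_step(w, b, X, y, eps=0.1, lr=0.05):
    """One adversarial-training step for logistic regression (a toy
    stand-in for the vision setting): perturb each input by eps in the
    sign of the input-gradient of the loss (FGSM), then take a normal
    gradient step on the perturbed batch."""
    def grads(X):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        err = p - y                             # dLoss/dlogit for BCE
        return X.T @ err / len(y), err.mean(), err

    _, _, err = grads(X)
    # gradient of the loss w.r.t. input x_i is err_i * w
    X_adv = X + eps * np.sign(np.outer(err, w))
    gw, gb, _ = grads(X_adv)
    return w - lr * gw, b - lr * gb

# usage: two separable 2-D clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
w, b = np.zeros(2), 0.0
for _ in range(200):
    w, b = fgsm_adv_train_step(w, b, X, y)
```

The analogue of "the adversary finds new off-distribution states" is that eps-ball FGSM only covers one threat model; a stronger adversary just picks perturbations outside it.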


> More targeted training won't do good, but why wouldn't more search help?

We've apparently entered the stage where the deciding factor in who wins, man or machine, is just an arms race.


> More targeted training won't do good, but why wouldn't more search help?

My understanding is that gwern above linked solid evidence in the paper that more search is not enough: the model's evaluation network is so far off target in these positions that realistic amounts of search don't help. Go also has many possible moves per position, so the search doesn't go very deep anyway.

Feel free to correct me if I'm wrong, it might be that I misremembered how AlphaGo-style systems work.


That paper (ROME) was the most famous paper in the field last year :)

See also new interesting developments breaking the connection between "Locating" and "Editing":

https://arxiv.org/abs/2301.04213

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models


[I mean no bad faith in this comment, I'm a fan of yours.]

Why answer questions about harmlessness/safety in such a roundabout way? Both OpenAI and Anthropic are clear about what words like "safe" are intended to mean: a stepping stone to "AI does not kill all people when given control".

Declining to state this clearly only invites unnecessary culture-war disagreements in every discussion of these models.


Maybe you’re right. It’s partially laziness on my part — it takes a while to explain long-term issues, and those who are inclined to care about them are generally aware of who started Anthropic and why.


A weird property of the described abstractions is that as you go tighter (interval -> zonotope -> polyhedra), the trained networks counterintuitively become less robust. Why does more precision in verification hurt training?

A recent work not mentioned in the last chapter, "Adversarial Training with Abstraction", is [1], which explains this issue using the notions of continuity and sensitivity of the abstractions.

[1]: https://arxiv.org/abs/2102.06700
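For intuition on the loosest of the three abstractions, here is a minimal interval bound propagation (box) sketch through an affine layer plus ReLU; the network and input here are made up for illustration, and zonotopes/polyhedra refine the same idea with tighter shapes:

```python
import numpy as np

def interval_affine(l, u, W, b):
    """Propagate an input box [l, u] through x -> W x + b. Split W into
    positive and negative parts so each output bound pairs with the
    correct end of the input interval."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

def interval_relu(l, u):
    # ReLU is monotone, so it maps boxes to boxes exactly
    return np.maximum(l, 0), np.maximum(u, 0)

# toy: bound a 2-layer net's output on the box [x - eps, x + eps]
x, eps = np.array([1.0, -1.0]), 0.1
W1, b1 = np.array([[1.0, -1.0], [0.5, 0.5]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)
l, u = interval_affine(x - eps, x + eps, W1, b1)
l, u = interval_relu(l, u)
l, u = interval_affine(l, u, W2, b2)
# final bounds: l == [1.8], u == [2.3]
```

Training against such bounds (certified training) optimizes a different, coarser objective than standard adversarial training, which is where the continuity/sensitivity discussion in [1] comes in.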


I too use large files with random notes, but I can't be bothered to write dates -- so I use git and cron to automate a searchable, persistent diary.

Let me write a blog post about it. The author of this article in particular might find it useful. Does anyone do something similar?


I like this. Would love to see the details of how you do it. I too dislike adding dates to entries. My solution has been a vimscript snippet that inserts the current date; I have bound it to <C-l><C-d>.

  fun! InsertDate()
    " splice today's date into the current line at the cursor column
    let l:line = getline('.')
    let l:date = strftime('%Y-%m-%d')
    call setline('.', strpart(l:line, 0, col('.')) . l:date . strpart(l:line, col('.')))
  endfun

  inoremap <C-l><C-d> <ESC>:call InsertDate()<CR>
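For what it's worth, the same insertion can be done without a function via the expression register, which evaluates a vimscript expression from insert mode and inserts the result at the cursor:

  inoremap <C-l><C-d> <C-r>=strftime('%Y-%m-%d')<CR>

This also sidesteps the cursor-position bookkeeping that leaving insert mode with <ESC> requires.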


I feel the current marketing of topological data analysis and similar is directed towards people who have at least taken several graduate math courses.

I'm really not an expert in TDA; I only know Ghrist's book and have read Hatcher. Are there really important topology ideas in topological learning, or can it all be phrased combinatorially?

If it's an important research area, the current language really raises the bar to entry.


> I feel the current marketing of topological data analysis and similar is directed towards people who have at least taken several graduate math courses.

Oh I don't know about that. Lots of TDA is accessible at an undergrad level, as, for example, the book by Ghrist that you mention shows. I may be biased as I did my PhD in the field, but I also supervised several master's students, and some came from engineering or science backgrounds with little more math than undergrad level under their belts. They mostly did well and learned the basics of the field well enough to make contributions during their thesis work. I would say that the field spans a vast range of applied-ness (from straight-up data analysis with persistent homology as merely a tool, to fully theoretical work on e.g. multi-persistence). The more applied areas of the field tend to be far more accessible to people without a deep math background.

> Are there really important topology ideas in topological learning, or can it all be phrased combinatorially?

I think topological machine learning is still in its (very exciting!) infancy, and there's nothing I'd like more than to be able to answer that question! I think there's lots of potential for an extremely interesting fusion of two disciplines. And I never forget something the above-mentioned Ghrist once told me (during a discussion unrelated to ML): never underestimate the treasure-hunting way of doing research; there's lots to be had from bringing together ideas from seemingly unrelated fields, especially when combining old and young ones. (That's me paraphrasing from memory, obviously; Ghrist is a lot more eloquent than that!)

TDA is in some sense just that: the very old of algebraic topology meets the very new of computation, data analysis, etc. And I think the meeting of topology and machine learning has potential to be very exciting too!

PS: When you say "or can it all be phrased combinatorially?", I want to point out that topology and combinatorics are not mutually exclusive. This is a nice book that straddles the realms of TDA, topology and combinatorics: https://www.springer.com/gp/book/9783540719618 . See also Matt Kahle's fabulous work on the probabilistic properties of certain randomly generated combinatorial topological spaces, in some sense generalizing the classical work of Erdős and Rényi on properties of random graphs that you may know (https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_...).


I thought the article would be about the quest for a better random graph model. It's weird that Quanta hasn't covered this yet.

For the other direction (ML helping graph theory and not vice versa), see Constructions in Combinatorics via Neural Networks [0] by Wagner (2021). I made an implementation to play with [1].

[0]: https://arxiv.org/abs/2104.14516

[1]: https://github.com/dpaleka/cross-entropy-for-combinatorics
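The cross-entropy method at the core of [0] fits in a few lines. A toy sketch (not the paper's code): it maximizes a score over bitstrings, with "count the ones" standing in for a graph-theoretic score such as Wagner's conjecture-counterexample rewards:

```python
import numpy as np

def cross_entropy_method(score, n_bits, iters=50, batch=200,
                         elite_frac=0.1, lr=0.7):
    """Maximize score(bits) over {0,1}^n_bits: sample bitstrings from
    independent Bernoullis, keep the top elite_frac by score, and move
    the sampling probabilities toward the elite samples."""
    rng = np.random.default_rng(0)
    p = np.full(n_bits, 0.5)
    for _ in range(iters):
        samples = (rng.random((batch, n_bits)) < p).astype(int)
        scores = np.array([score(s) for s in samples])
        elite = samples[np.argsort(scores)[-int(batch * elite_frac):]]
        p = (1 - lr) * p + lr * elite.mean(axis=0)  # cross-entropy update
    return (p > 0.5).astype(int)

# usage: trivially, the all-ones string maximizes the number of ones
best = cross_entropy_method(lambda s: s.sum(), n_bits=20)
```

In the paper, the bitstring encodes a graph's adjacency, the score is (roughly) how close the graph comes to violating a conjectured inequality, and the Bernoulli sampler is replaced by a neural network.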


How do models trained with Lightly compare with other approaches wrt adversarial robustness?

Can using Lightly introduce additional bias in the model, since only a select few inputs are being labeled? This may be a concern for publicity purposes.

By the way, I thought ETH spinoff requirements were incompatible with YC requirements - nice to see it can be made to work.


Thanks for the interest and great questions. Responses are below:

>How do models trained with Lightly compare with other approaches wrt adversarial robustness?

We have no benchmark available, but the two approaches can be combined: use Lightly to pick a diverse subset, label it, then check for adversarial robustness while training and evaluating the model, and iterate.

>Can using Lightly introduce additional bias in the model, since only a select few inputs are being labeled? This may be a concern for publicity purposes.

If we remove bias we automatically introduce bias. BUT we want the introduced bias to be controlled and known.

Bias typically comes from the way we collect data. For example, in autonomous driving, more data is collected during the day than at night, more during sunny weather than in rain or snow, and more from cities like San Francisco than from places like New Mexico. Most of our datasets are biased.

> By the way, I thought ETH spinoff requirements were incompatible with YC requirements - nice to see it can be made to work.

As far as we know, we are the first ETH spin-off that is part of the YC program. We hope they don't abandon us.


No one is reaching out in Switzerland right now, but I guess they should.


Switzerland has plenty of code written in VB/C#, and tech like WPF is sought after. I highly recommend you look into that and do a few side projects using this "old, uncool" tech if you want a job there. (Source: I lived and worked in CH for 6 years.)


Implementation where you can try to disprove conjectures yourself: https://github.com/dpaleka/cross-entropy-for-combinatorics

