The terms prohibiting use of the output to develop a competing product amount to a ban on reverse engineering, as far as I can tell.
Competitors can't get OpenAI's model weights, but they can use its outputs to produce a functionally similar model.
It's as if you had a competitor's engine and couldn't open it up, but could still measure the outputs (torque, RPM, ...) and control the inputs (fuel intake, air mixture, ...), then built your own engine by inferring the design from those measurements.
No, the result is just an engine that produces the same output when given the same input. That's like saying Google reverse engineered Yahoo to build its search engine.
I don’t think it counts as reverse engineering if you only treat it as a black box.
In addition, this is much more akin to data exfiltration than to copying an engineered mechanism. The training algorithm could count as an engineered mechanism, but that’s not what is being copied.
The engine analogy is not a great one. Those inputs and outputs are very crude information. The actual design of an engine is quite complicated and subject to very close tolerances.
Your analogy is like going to an airport, observing departure and arrival times along with the flight paths, and then "reverse engineering" an aircraft from that. The chances are very low that you'd produce anything remotely resembling the aircraft you're trying to copy. The same goes for the engine.
In the case at hand we are dealing with a mathematical function. In->Out is all there is. Back-estimating a mathematical function by sampling is as close to reverse engineering as anything is.
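To make that concrete, here's a toy sketch of the idea (the function and sampling scheme are my own illustration, not anything from OpenAI): treat an unknown function as a black box you can only query, sample input/output pairs, and fit a surrogate that reproduces its behavior without ever seeing its internals.

```python
import numpy as np

# A hidden "proprietary" function we can only query, never inspect.
def black_box(x):
    return 0.5 * x**3 - 2.0 * x + 1.0

# Collect input/output samples, the way a competitor querying an API would.
xs = np.linspace(-3, 3, 50)
ys = black_box(xs)

# Fit a surrogate model purely from the observed samples.
coeffs = np.polyfit(xs, ys, deg=3)
surrogate = np.poly1d(coeffs)

# The surrogate matches the black box on the sampled range,
# even though we never looked at its "weights".
print(np.allclose(surrogate(xs), ys))
```

A real model is a vastly higher-dimensional function, and distillation uses a neural network rather than a polynomial, but the principle is the same: with enough samples, In->Out is enough to recover the behavior.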
> Competitors can't get OpenAI's model weights but use its outputs to produce a functionally similar model.
> It's like if you had a competitor's engine and couldn't open it up but could still see the outputs: torque, rpm, ..., and could control the inputs: fuel intake, air mixture, etc... Then you make an engine by inferring back from these measurements.
Wouldn't that be reverse engineering?