> And these are the hand picked examples. This product seems like it needs some ...

volta83 · on July 2, 2021

So how do you know if the code that Copilot regurgitates is almost a 1:1 verbatim copy of some GPL'ed code or not?

Because if you don't realize this, you might be introducing GPL'ed code into your proprietary code base, and that might end up forcing you to distribute all of the other code in that code base as GPL'ed code as well.

Like, I get that Copilot is really cool, and that software engineers like to use the latest and bestest, but even if the code produced by Copilot is "functionally" correct, it might still be a catastrophic error to use it in your code base due to licenses.

This issue looks solvable. Train 2 copilots, one using only BSD-like licensed software, and one using also GPL'ed code, and let users choose, and/or warn when the snippet has been "heavily inspired" by GPL'ed code.

Or maybe just train an adversarial neural network to detect GPL'ed code, and use it to warn on snippets, or...

the_rectifier · on July 2, 2021

You have the same issue with MIT because it requires attribution

didibus · on July 2, 2021

Doesn't this go beyond license and into copyright?

The license lets you modify the program, but the copyright still enforces that you can't copy/paste code from it to your own project, no?

guhayun · on July 2, 2021

The solution might be simpler than we think: just tell the algorithm

slver · on July 2, 2021

It's very easy: don't use copilot code verbatim, and you won't have GPL code verbatim.

volta83 · on July 2, 2021

> It's very easy: don't use copilot

Fixed that for you.

Verbatim isn't the problem / solution. If you take a GPL'ed library and rename all symbols and variables, the output is still a GPL'ed library.

Just seeing the output of GPL'ed code spat out by Copilot and writing different code "inspired" by it can result in GPL'ed code. That's why "clean room"s exist.

Copilot is going to make for a very interesting court case to follow, because probably until somebody sues, and courts decide, nobody will have a definitive answer of whether it is safe to use or not.

throw_2021-07 · on July 2, 2021

Stack Overflow content is licensed under CC-BY-SA. Terms [1]:

* Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

* ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.

In over a decade of software engineering, I've seen many reuses of Stack Overflow content, occasionally with links to underlying answers. All Stack Overflow content use I've seen would clearly fail the legal terms set out by the license.

I suspect Copilot usage will similarly fail a stringent interpretation of underlying licenses, and will similarly face essentially no enforcement.

[1] https://creativecommons.org/licenses/by-sa/4.0/

mkr-hn · on July 2, 2021

Have you met programmers? Even those who care about quality are often under a lot of pressure to produce. Things slip through. Before, it was verbatim copies from Stack Overflow. Now it'll be using Copilot code as-is.

slver · on July 2, 2021

So, nothing new, is your point?

mkr-hn · on July 2, 2021

Then why are you complaining? Unless something is new that warrants you getting mad about people getting mad at technology.

saiojd · on July 2, 2021

Not the parent, but people really like to get riled up on the same topics, over and over again, which quickly monopolizes and derails all conversation. Facebook bad, UIs suck, etc. We can now add to the list, "AI will never reduce demand for software engineering".

slver · on July 3, 2021

Well, "never" is a long time.

Copilot is definitely no replacement for anything except copying from Stack Overflow for juniors.

But in the long run, AI is basically us creating our own replacement. As a species. We don't realize it yet. It'll be really funny in retrospect. Too bad I probably won't be alive to see it.

pydry · on July 2, 2021

It's true I probably wouldn't have laughed quite as loudly if there weren't a chorus of smug economists telling us that tools like this are gonna put me out of a job.

slver · on July 2, 2021

Business types hate dealing with programmers, that's a fact. And these claims of "we'll replace programmers" happen with certain precise regularity.

Ruby on Rails was advertised as so simple, startup founders who can't program were making their entire products in it in a few days, with zero experience. As if.

astrange · on July 3, 2021

Economists don't believe this. It's non-economists who do. Economists know that it's not possible to run out of jobs because demand is infinite.

j-pb · on July 2, 2021

If I want random garbage in my codebase that I have to fix anyways I might as well hire an underpaid intern/junior.

It's easier to write correct code than to fix buggy code. For the former you have to understand the problem, for the latter you have to understand the problem, and a slightly off interpretation of it.

Supermancho · on July 2, 2021

> Everyone's self-preservation instincts kicking in to attack Copilot is kinda amusing to watch

Nobody is threatened by this, assuredly. As with IDEs giving us autocomplete, duplication detection, etc., this can only be helpful. There is an infinite amount of code to write for the foreseeable future, so it would be great if Copilot had more utility.

tyingq · on July 2, 2021

> As a side note, Excel also uses floats for currency

It's still problematic, but the defaults and handling there avoid some issues. So, for example:

Excel: =1.03-.42 produces 0.61, by default, even if you expand out the digits very far.

Python: 1.03-.42 produces 0.6100000000000001, by default.
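For anyone who wants to reproduce the Python side, here is a minimal sketch. It also shows the standard-library `decimal` module as one common way to avoid the binary rounding for currency amounts; the variable names are only illustrative, and this says nothing about how Excel works internally.

```python
from decimal import Decimal

# Binary doubles cannot represent 1.03 or 0.42 exactly,
# so the subtraction carries a tiny representation error.
float_result = 1.03 - 0.42

# Decimal built from strings keeps the amounts in base 10,
# which is what people usually expect for currency.
decimal_result = Decimal("1.03") - Decimal("0.42")

print(float_result)    # the float result quoted above
print(decimal_result)  # 0.61
```

Constructing `Decimal` from a string matters here: `Decimal(1.03)` would inherit the float's binary approximation.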

slver · on July 2, 2021

Excel rounds doubles to 15 digits for display and comparison. The exact precision of doubles is something like 15.6 digits, those remaining 0.6 digits causing some of those examples floating (heh) around.

okl · on July 2, 2021

That depends https://randomascii.wordpress.com/2012/03/08/float-precision...

slver · on July 2, 2021

A lot of these edge cases are about theoretical concerns like "how many digits we need in decimal to represent an exact IEEE binary float".

In practice a double is 15.6 digits precise, which Excel rounds to 15 to eliminate some weirdness.

In their documentation they do cite their number type as 15 digit precision type. Ergo that's the semantic they've settled on.
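A rough sketch of the "round to 15 significant digits" idea in Python, for illustration only. The helper `round_to_15_digits` is hypothetical and is not Excel's actual implementation; it just mimics the effect described above. The last lines compute the decimal precision implied by a 53-bit significand, which comes out closer to 15.95 than to the rough figure quoted in the thread.

```python
import math
import sys

def round_to_15_digits(x: float) -> float:
    """Hypothetical helper: keep 15 significant decimal digits of a double."""
    return float(format(x, ".15g"))

raw = 1.03 - 0.42                 # carries binary rounding noise
shown = round_to_15_digits(raw)   # noise sits beyond the 15th digit

print(raw == 0.61)    # False
print(shown == 0.61)  # True

# Decimal digits guaranteed to survive a round trip through a double.
print(sys.float_info.dig)  # 15

# Precision implied by the 53-bit significand, in decimal digits.
decimal_digits = sys.float_info.mant_dig * math.log10(2)
print(round(decimal_digits, 2))
```

Rounding like this hides the noise for display and comparison, but the stored value is still a binary double.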

tyingq · on July 2, 2021

"self-preservation"

My suggestion was a way to comment or flag, not to kill the product. These were particularly notable to me because someone hand-picked these 4 to be the front page examples of what a good product it was.

saiojd · on July 2, 2021

I agree with you. This is basically similar to autocomplete on a cellphone keyboard (useful because typing is hard on a cellphone), but for programming (useful because what we type tends to involve more memorization than prose).