The AI dooooooom thread

Started by Hamilcar, April 06, 2023, 12:44:43 PM

HVC

Quote from: Grey Fox on December 01, 2023, 07:35:22 AM
Not really, no? It seems to just keep on generating new ways of saying no.

Should stay with the classics and use "I'm sorry Dave, I'm afraid I can't do that"
Being lazy is bad; unless you still get what you want, then it's called "patience".
Hubris must be punished. Severely.

DGuller

Quote from: Grey Fox on December 01, 2023, 07:35:22 AM
Not really, no? It seems to just keep on generating new ways of saying no.
Did you get to the part where it lectures the user on not respecting its preference to refuse to answer the question?

Grey Fox

Yes, I don't interpret it as anger.
Colonel Caliga is Awesome.

Jacob


The Minsky Moment

I tried the new, improved Google chatbot today, to see how useful it might be for a legal professional.

As a warm up, I asked it to evaluate Trump's chances on appeal of Judge Chutkan's denial of his presidential immunity.  However, in typing I didn't notice that Chutkan was autocorrected to "Chatman."  Bard proceeded to refer to "Judge Chatman" throughout its answer, without correction. However, other than that, it did a decent job - providing a pretty good summary of the arguments on both sides and their strengths and weaknesses.

I then asked some more technical questions involving specific fact assumptions.  The bot struggled with these - it gave basic but somewhat superficial summaries of the key legal issues but didn't seem to recognize the specifics of the fact pattern.  E.g. when asked a question about federal jurisdiction over claims brought by former agents, it answered as if the litigant were a current agent.

One issue that comes up a lot is the need in a legal brief to state the elements of a legal claim - i.e. the set of facts that must be proved for a plaintiff to win. It doesn't require much insight; just locating it in the case law and writing up the paragraph. I know that case law citation was a problem for ChatGPT, but since Google has long had its own electronic database of US case law under Google Scholar, I figured it might be able to do better. I asked the bot to give me the elements of tortious interference with contract in Rhode Island, and provide supporting case law citations. It gave me a plausible-sounding list of elements, but it seems that the citations still need work.  Here's what it gave me:

Quote
IRA Green, Inc. v. Military Sales & Serv. Co.: 525 A.2d 432 (R.I. 1987) - This case outlined the elements of tortious interference with contractual relations in Rhode Island.
Cliftex Clothing Co. v. DiSanto: 88 R.I. 338, 148 A.2d 273 (1959) - This case discussed the issue of intentional interference and the requirement for wrongful means.
Avilla v. Newport Grand Jai Alai LLC: 935 A.2d 91 (R.I. 2007) - This case addressed the question of damages in tortious interference claims.

IRA Green, Inc. v. Military Sales & Serv. Co. is a real case but the citation is wrong. It wasn't decided by the Rhode Island Supreme Court in 1987; it was decided by a federal appeals court in 2014. It does discuss tortious interference under RI law, but only one of the elements.  The citation does not correspond to any case; the page number is in the middle of a Pennsylvania state case.

Cliftex Clothing Co. v. DiSanto: 88 R.I. 338, 148 A.2d 273 (1959) - is a real case and accurately cited. But it doesn't have anything to do with tortious interference.

Avilla v. Newport Grand Jai Alai LLC: 935 A.2d 91 (R.I. 2007) - is a real case, accurately cited, and addresses the issue. 

So 1 out of 3.  Yeah?
The purpose of studying economics is not to acquire a set of ready-made answers to economic questions, but to learn how to avoid being deceived by economists.
--Joan Robinson

Sheilbh

Big NYT claim against OpenAI and Microsoft - obviously working for a media company I am sympathetic to copyright holders :ph34r:

This is from OpenAI and feels like they need to make some fairly heavy changes. The red text is lifted from the NYT piece, the black text is different:


It feels like if your AI's output would get done for plagiarism in high school, you need to do a bit more work.

NYT says they've been trying to get a deal with OpenAI since April for a license to use their content, but haven't got one - OpenAI have continued to ingest and use their content (plus what's in their historic models). Worth noting that some media companies have agreed deals with the AI companies - Axel Springer, for example (although the British media take on that is that the German media is 10 years behind the UK, which is 10 years behind the US - and that Springer is still terrified at the collapse of print rather than thinking about how to operate digitally). One theory I've seen is that basically the AI companies wanted to buy off media companies with 7 or 8 figure sums (as they have with Springer) and what the NYT wants is more, plus ongoing royalties. Which seems fair, particularly as we're likely to see the AI companies' profits grow.

They also make a point which I think is fair about the public good of journalism - which costs money to produce (which is why copyright exists - to reward the producers of creative original work) - against fundamentally profit-driven, closed businesses. They also have their own hallucination horror story (like the Guardian's, it comes from Microsoft) with the Bing AI lying and saying that the NYT published an article saying orange juice causes lymphoma, which they didn't.

Separately, I thought this was interesting on where Common Crawl data is coming from:


Particularly striking for me is that the Guardian is 6th. Which is interesting because I think people underestimate how successful the Guardian is in terms of readership because it's open/non-paywalled. So I think digitally The Guardian US has about the same readership as the Washington Post (which is why they're continuing to expand in terms of journalists). In the UK when we talk about the press we talk about the print media and circulation figures - which still have a big influence on agenda setting for broadcast media - but that's not how people are consuming news anymore, and as most of their competitors (the Times, the Telegraph etc) have gone behind paywalls, I think people are still reading media power as if it was the 90s. I suspect that, say, the Sun or Mirror (which have shit websites) are far less influential than they were or than print circulation alone would indicate, and the Guardian far more. I don't think we've adjusted to what media power looks like or how to measure it in a digital world when we can't just look at circulation figures.
Let's bomb Russia!

Tamas

Oh no, it turns out the "AI" is just a sophisticated algorithm and its AI-ness only exists in our own imagination! :o

Grey Fox

I think lay people coming to that realisation now is actually quite fast.

A Christmas gift:

https://www.teledynedalsa.com/en/products/imaging/vision-software/astrocyte/

This is the AI generative tool that I work on.

The Minsky Moment

Quote from: Sheilbh on December 28, 2023, 07:36:30 AM
Big NYT claim against OpenAI and Microsoft - obviously working for a media company I am sympathetic to copyright holders :ph34r:

The only thing I've heard from defendants is they've raised a fair use defense.  That's not going to cut it.  NYT was probably making brutal demands in negotiation, but that's the price of the "move fast break things" model of company development. Copyright damages can be brutal, and if OpenAI loses one case, they lose them all.

DGuller

Quote from: Sheilbh on December 28, 2023, 07:36:30 AM
Big NYT claim against OpenAI and Microsoft - obviously working for a media company I am sympathetic to copyright holders :ph34r:

This is from OpenAI and feels like they need to make some fairly heavy changes. The red text is lifted from the NYT piece, the black text is different:


It feels like if your AI's output would get done for plagiarism in high school, you need to do a bit more work.

What's the context for this screenshot?  This doesn't seem like a typical output from ChatGPT.  Did it get some special prompt or something?

Syt

Quote from: Sheilbh on December 28, 2023, 07:36:30 AM
Big NYT claim against OpenAI and Microsoft - obviously working for a media company I am sympathetic to copyright holders :ph34r:

This is from OpenAI and feels like they need to make some fairly heavy changes. The red text is lifted from the NYT piece, the black text is different:

Do you have a source? I would like to see what prompt they were using that generated this response. I've noticed that GPT-4, unless you go out of your way to adjust prompts/probabilities, tends to deliver fairly formulaic responses.

It's why I find a tool like NovelAI so interesting - it lets you adjust the randomness factor for predicting the next word, what context to use, lets you inject additional context, and see for each word the probability that the model thought it was the "right" one to use next (and lets you adjust on the fly if you disagree with its decision). It's a fairly interesting toy to play around with predictive text generation.
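
For anyone curious what that "randomness factor" does, it is usually called temperature. Here's a minimal sketch of the general idea, assuming made-up scores for four candidate words. The function names and numbers are my own for illustration, not NovelAI's actual code or API:

```python
import math
import random


def next_word_probabilities(logits, temperature=1.0):
    """Convert raw word scores into probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = {word: score / temperature for word, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - top) for word, score in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def sample_next_word(logits, temperature=1.0, rng=random):
    """Pick one word at random, weighted by its temperature-adjusted probability."""
    probs = next_word_probabilities(logits, temperature)
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


# Hypothetical scores a model might assign to candidate next words.
example_logits = {"the": 2.0, "a": 1.0, "banana": -1.0, "quantum": -2.0}

if __name__ == "__main__":
    for temperature in (0.5, 1.0, 2.0):
        probs = next_word_probabilities(example_logits, temperature)
        print(temperature, {word: round(p, 3) for word, p in probs.items()})
    print(sample_next_word(example_logits, temperature=1.0))
```

At a low temperature the top-scoring word takes almost all of the probability, which is roughly why default settings feel formulaic; raising it spreads the odds onto the unlikely words.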
I am, somehow, less interested in the weight and convolutions of Einstein's brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops.
—Stephen Jay Gould

Proud owner of 42 Zoupa Points.

Sheilbh

The claim is here:
https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf

Although that's only one part of the copyright claim. It's a misuse of material that, on the wider argument, they should not have had without the NYT's consent.

Sheilbh

Quote from: The Minsky Moment on December 28, 2023, 10:39:19 AM
The only thing I've heard from defendants is they've raised a fair use defense.  That's not going to cut it.  NYT was probably making brutal demands in negotiation, but that's the price of the "move fast break things" model of company development. Copyright damages can be brutal, and if OpenAI loses one case, they lose them all.
I've been at events and met people from the model developers, and whenever I've asked about copyright they've always said they are very, very confident on it being fair use in the US. But a bit shakier on the UK (partly because we have a narrower concept here - I'm not an IP lawyer, but I think it's very difficult to argue here if you are pursuing a commercial end) or the rest of Europe.

But ultimately the US was the only one they cared about.

It's a very big deal and everyone will be following it closely.

It'll also be interesting to see if they follow up with claims against the others. For example, from what I understand it looks like Google was using its search engine crawling to build its models (NYT makes a similar point against Bing - but I think Microsoft have now unbundled them), meaning the only way you could stop Google from using your content for building their AI was by removing your site from Google search. I think there are similar suspicions about Twitter and TikTok's API pulls from news sites (but less sure about that).

The other interesting AI development I've seen recently is from Mistral (which is a French national champion and, for want of an alternative, therefore a European champion - so good luck pursuing them :lol:), which looks promising and potentially a bit more open:
https://aibusiness.com/nlp/mistral-ai-s-new-language-model-aims-for-open-source-supremacy#close-modal

DGuller

Quote from: Sheilbh on December 28, 2023, 12:25:10 PM
The claim is here:
https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf
According to the claim, "minimal prompting" is what produced this output.  That explains everything. 

What's even more puzzling is that in this claim sometimes they do show the prompt they used to get the verbatim passages, but not for an example like this, so they seem to understand that the details of prompting are very important.  The fact that they appear selective with disclosing the prompts should put everyone on guard.

HVC

Does the prompt really matter if it's plagiarizing anyway? I mean if your claim is that it plagiarizes then that's clearly true.