Talk:Clearview AI

Sources for expansion

IAR by putting this as the top section, feel free to cross out or add to this list:

tedder (talk) 17:09, 12 March 2020 (UTC)[reply]

Findface

Is there a sourceable connection to FindFace and the NTechLab algorithm? Seems oddly coincidental. Alexpl (talk) 10:05, 21 January 2020 (UTC)[reply]

COI

There is some serious conflict of interest or something going on- the current 'technology' section looks like it's straight out of the company's PR. Then there's additions like this, straight off the company's website and removing some neutrality in favor of what was on the `cv_consumer` reference from their website and restating their political connections in a more favorable way. I removed the worst of it but it needs discussion. tedder (talk) 23:36, 25 January 2020 (UTC)[reply]

All of Bibodidad's contributions up to now are to this article. WeyerStudentOfAgrippa (talk) 12:23, 26 January 2020 (UTC)[reply]
As are the WP:SPA IPs 98.180.170.111 (talk · contribs · WHOIS) and 2604:2000:1406:23B:8C75:BE17:2B93:612 (talk · contribs · WHOIS). tedder (talk) 22:20, 26 January 2020 (UTC)[reply]

Peacock tag

I removed the {{peacock}} tag from the page. Basically every word is cited, some specific examples would be helpful. tedder (talk) 20:22, 5 March 2020 (UTC)[reply]

Possible source

The Far-Right Helped Create The World's Most Powerful Facial Recognition Technology. Doug Weller talk 17:50, 15 April 2020 (UTC)[reply]

Already used it, from the dot-com version of huffpo. See the ref named "huff_Far", used in three places including "far-right clique". tedder (talk) 18:02, 15 April 2020 (UTC)[reply]

Request Edit - Proposed Addition of Article Citation

Information to be added to the article: Reference citation to recently published article "Some Observations on the Clearview AI Facial Recognition System- From Someone Who Has Actually Used It...." https://www.linkedin.com/pulse/some-observations-clearview-ai-facial-recognition-system-blatt/

Explanation of Issue: The Clearview AI article references citations (articles, comments) but none of the authors/cited sources have any hands on experience in actually using Clearview AI in terms of its end user functionality in the hands of a police officer/criminal investigator. The article that I recently published provides readers with exactly what is missing from the current Clearview AI article discussion and I believe adds valuable practical understanding of how Clearview is used by police. I am not aware of any other published article on Clearview that has been authored by an actual law enforcement end user of Clearview. The unique perspective of my article was supported by, for example, Professor Jonathan Zittrain (who is quoted in the Clearview AI article) on Twitter thanking me for the piece as very helpful to describe how the system works. https://twitter.com/zittrain/status/1250805605478076417?s=21 and https://twitter.com/zittrain/status/1250806281474097155?s=21

Location for the Proposed Citation Addition: A citation to my article may be either at the end of the Reception section referencing the piece or alternatively cited in the See Also section to provide readers with the option to understand the actual user functionality provided by Clearview to police officers.

Link: https://www.linkedin.com/pulse/some-observations-clearview-ai-facial-recognition-system-blatt/ --Techlawyer (talk) 04:15, 19 April 2020 (UTC)[reply]

How does this meet WP:VERIFY and WP:RS? It may have great merit, but I can't see that it was "reliably published". Doug Weller talk 09:47, 19 April 2020 (UTC)[reply]
Indeed- linkedin is not considered a reliable source, as it's all self-published (e.g., the same as a blog). Having a unique point of view doesn't override that. tedder (talk) 18:35, 19 April 2020 (UTC)[reply]
Declined due to lack of reliable sourcing. Mdaniels5757 (talk) 22:04, 27 April 2020 (UTC)[reply]

links/subjects for inclusion

in the lawsuit injunction update: "it is clear [the data they] unlawfully collected and possess are not safe or secure." tedder (talk) 19:19, 22 April 2020 (UTC)[reply]

Jessica Medeiros Garrison sources for expansion

These would work well in an article, but she doesn't have one. Putting here for future use.

  • Washingtonian, Washington’s Most Powerful Women 2021, Oct 13 2021: "Jessica Medeiros Garrison, vice president of government affairs at Clearview AI. The firm’s innovative facial-recognition technology frightens a lot of civil-libertarians. Garrison is involved in selling it."
  • Alabama Today, Mountain Brook resident Jessica Garrison named one of ‘Washington’s most powerful women’, Beth Cann, Dec 27 2021: born in Rhode Island, "After graduating from the University of Alabama in 1997, Garrison served as director of legislative affairs and public information in the office of Bill Pryor while he was Alabama Attorney General. She graduated from the University of Alabama School of Law in 2000."
  • The Homewood Star, Garrison named one of ‘Washington’s Most Powerful Women’: Mountain Brook resident attracts attention for role with tech firm Clearview AI, Jesse Chambers, Dec 31 2021: "...a Mountain Brook resident since 2011 who lives in Crestline. ... “We are now recognized as the most accurate algorithm in the Western world,” Garrison said. ... “I had every intention of being a Bama cheerleader when I arrived on campus,” she said. “I had led my high school squad.” ... “The prolific preying upon and abusing children is something I don't think our society grasps, yet,” Garrison said. “There are so very many threats lying in wait for children — from infants to teens. The more I have learned, the more committed I am to using all weapons in my arsenal to fight back and to effect change. We can no longer ignore the problems. My work with Clearview AI certainly provides answers.”"

tedder (talk) 19:35, 1 January 2022 (UTC)[reply]

Several irrelevant sources

I just stumbled into this article, and was surprised to find several (like, almost 10) sources that seemed totally unrelated to the claims being made. I think someone needs to comb through the references in this article to ensure that's not happening elsewhere too. It seems to me there's a likely case of WP:OVERCITE, and possibly WP:REFBOMB in parts of this article. StereoFolic (talk) 05:11, 26 July 2023 (UTC)[reply]

Sheesh, you're right, StereoFolic! That's awful! I found lots of overciting, but only one example of what you highlighted. Thanks for noticing!--FeralOink (talk) 15:03, 28 August 2023 (UTC)[reply]

Editorial Comments

Some new sources have come out on this topic and the article is strongly biased in some areas, so I am working on some improvements. Draft is temporarily here. I will fix my editorial remarks before adding info from new sources.

Several instances of nonencyclopedic style, which I will correct.

I plan to consolidate the Far-Right connections into its own section rather than have them spread throughout.

I did all this and added a new source. I would like some feedback. Czarking0 (talk) 21:28, 20 September 2023 (UTC)[reply]

Hello Czarking0! You have done good work, e.g. removing a heavily over-wikilinked passage of marginally-related names/companies, and some consolidation in addition to what you referred above.
  1. Please be aware that Clearview AI settled with the ACLU in 2022, agreeing to offer its services to law enforcement and government agencies ONLY. Some of the changes I made a few weeks prior to yours were updates that informed of that. (I do consider it important to retain some history of the company's practices, before the ACLU settlement.) You reintroduced older content, e.g. in the lead, it now says Clearview offers its services to governments, law enforcement, and other actors. Has Clearview broken terms of the ACLU agreement recently? If so, refer me to the source, about who these "other actors" are?

Thanks for the feedback FeralOink. I said other actors because they are selling to schools which I think is technically not in violation of the ACLU agreement but does not really fall into government or law enforcement. That is sourced in the article already. Here is a link: https://www.reuters.com/technology/clearview-ais-facial-recognition-tool-coming-apps-schools-2022-05-24/ Czarking0 (talk) 18:22, 22 September 2023 (UTC)[reply]

  2. I have no problem with consolidating certain details in a "far-right" section, but the HuffPo source is dated, and isn't the best in terms of WP:RS. I'd like to replace it with something else if possible. I'll look around, to see if I can find anything more WP:WS.

The article had another source on this matter, which was from AI Now. It was published in blog form and I did not think it met WP:RS. I do not think HuffPost is the best. I am not a left-leaning person and I was tempted to remove all the references to far-right activity since the sources do not seem to be great, but I was worried that I was then implementing my own bias. After checking the source, I do think the HuffPost journalist is correct that there is some connection between this company and the far-right; however, it is hard to say in an unbiased and well-sourced manner what that connection really is. If you want to add better sources or remove the section entirely, I will support either option. Czarking0 (talk) 18:22, 22 September 2023 (UTC)[reply]

  3. I'll continue looking through the revised and updated article. I'll note any concerns that warrant a response here, but I won't make any changes until you have had a chance to respond here. I'll just fix some minor formatting and get rid of extra/white spaces.--FeralOink (talk) 14:36, 22 September 2023 (UTC)[reply]
Czarking0, there are several instances of overcites, i.e. the same information / news story coverage by multiple sources are each referenced. That is inconsistent with WP:MOS (and adds nothing) so I'm going to clean those up, where possible. At most, I will retain two rather than, say, five! Also, I did some research on Clearview's customer base. I find no indication that Clearview is selling their surveillance services (and image database) to anyone other than law enforcement and government agencies. In other words, they are abiding by the terms of their settlement with the ACLU... HOWEVER, Clearview's L.E. and government agency customer base is NOT limited to the US, so I will make sure to indicate that and retain article coverage about it. I don't feel that foreign governments, e.g. New Zealand, should be considered "actors", so I will remove that word from lead.--FeralOink (talk) 15:43, 22 September 2023 (UTC)[reply]

As stated above, "actors" was merely meant to refer to schools, so if you feel like saying schools in the title then I am on board. Otherwise I am not sure. Czarking0 (talk) 18:22, 22 September 2023 (UTC)[reply]

Czarking0, I made some minor changes. The customer list from the Buzzfeed data breach needs to be trimmed, particularly for companies that never used the application and don't have Wikipedia articles. I won't do any further edits until you have a chance to share your thoughts here.--FeralOink (talk) 17:23, 22 September 2023 (UTC)[reply]
You are clear to do further edits Czarking0 (talk) 16:14, 26 September 2023 (UTC)[reply]

Done I have responded. You can continue with edits. I made a small change to the history section where part of the timeline was not in order. — Preceding unsigned comment added by Czarking0 (talk • contribs) 18:22, 22 September 2023 (UTC)[reply]

Best In The World

@Grayfell I would like you to restore the section on them being one of the best facial recognition algorithms in the world. The NIST study which I used as a source is reputable and unbiased. WP:PROMO does not apply because this information is the summary of a government-funded comparative study. If you think "best" is too vague then I think it is fair to clarify what the metrics of the study are; however, the algorithm's performance is notable and the article is worse off with the reader unaware of how good their algorithm is compared to the state of the art. Czarking0 (talk) 21:11, 4 October 2023 (UTC)[reply]

If you cannot figure out how to summarize this neutrally it doesn't belong. Wikipedia isn't a platform for advertising. Grayfell (talk) 21:14, 4 October 2023 (UTC)[reply]
Actually, your cited source doesn't support this, anyway:
In a field of over 300 algorithms from over 200 facial recognition vendors, Clearview ranked among the top 10 in terms of accuracy, alongside NTechLab of Russia, Sensetime of China and other more established outfits. But the test that Clearview took reveals how accurate its algorithm is at correctly matching two different photos of the same person, not how accurate it is at finding a match for an unknown face in a database of 10 billion of them.[1] (emphasis added)
The article also says ...Oddly, Clearview submitted its algorithm for the former test, rather than the latter one, which is what its product is built to do.[2]
The source is very clearly skeptical of the company's grandiose PR claims. For us to pass along these claims without any context would be misrepresenting those sources, in addition to being over-promotional and misleading. Grayfell (talk) 21:20, 4 October 2023 (UTC)[reply]
Ok then I will just say "Clearview's algorithm is in the top 10 for accuracy of matching two faces of the same person." I am offended that you accuse me of passing on their grandiose PR claims. I have not read their PR claims and am adding this claim from that source. Czarking0 (talk) 23:56, 5 October 2023 (UTC)[reply]
Your addition highlighted one specific aspect of a source without including the context provided by that same source. All sources must be evaluated in context, and the significance of this one study should be included. Regardless of your intentions, placing this in the lead without context is promotional. Grayfell (talk) 01:01, 6 October 2023 (UTC)[reply]
I have adjusted the wording in the body to include this context, per the cited source. Grayfell (talk) 01:16, 6 October 2023 (UTC)[reply]

GA Review

This review is transcluded from Talk:Clearview AI/GA1. The edit link for this section can be used to add comments to the review.

Nominator: Czarking0 (talk · contribs) 04:50, 30 March 2024 (UTC)[reply]

Reviewer: Mike Christie (talk · contribs) 22:17, 6 June 2024 (UTC)[reply]


I'll review this. Mike Christie (talk - contribs - library) 22:17, 6 June 2024 (UTC)[reply]

Running Earwig finds the following:

  • Source: "Clearview has created more than 200 accounts for users at five Ukrainian government agencies, which have conducted more than 5,000 searches. Clearview has also translated its app into Ukrainian ... from three agencies in Ukraine, confirming that they had used the tool. It has identified dead soldiers and prisoners of war, as well as travelers in the country,..." Article: "Clearview had created over 200 accounts for users at five Ukrainian government agencies, which have conducted more than 5,000 searches, and that Clearview has also translated its app into Ukrainian. Ton-That provided emails from officials of three agencies in Ukraine, confirming that they had used the tool to identify dead soldiers and prisoners of war, as well as travelers in the country." See WP:CLOP; this needs to be rewritten in your own words.
    The new wording is better, but I think is still identifiably a version of the original. How about "Ukrainian government agencies have used Clearview over 5,000 times, to identify dead soldiers, prisoners of war, and travelers"? The fact that Clearview created the accounts for them seems trivially obvious, and the "200 accounts" isn't as important in this context as the number of searches. Mike Christie (talk - contribs - library) 14:44, 9 June 2024 (UTC)[reply]

Will look at the sources next. Mike Christie (talk - contribs - library) 21:13, 7 June 2024 (UTC)[reply]

Sources:

  • Can we avoid the use of The Daily Dot? Per WP:RS/PS it's not a very good source. Here you're using it as one of three citations covering the same information; if the other sources cover the same ground I'd drop this one.
     Done
  • Similarly, the use of The Next Web is discouraged. This article seems to be just an opinion piece rehashing other sources, so not a great source regardless.
     Done Kept the claim but have a much better source
  • What makes cpomagazine.com a reliable source? The about page just says it is corporately owned, which is a good start, but does it have editorial control over what it publishes or is it a one-person operation? If we can't find that out, does it have a good reputation or get cited by other reliable sources?
     Done could not find material to establish RS. Kept the text for now. Will rework in another bullet point
  • What makes publicola.com a reliable source? Per the about page it seems to be a one-person operation.
    WIP I do not believe it is a one-person operation as the contact for the publication is not the author of the source. Working on establishing RS.
    I don't know if I should consider a reddit thread but this makes me think not RS
  • The New York Post is not a reliable source.
     Done
  • You cite Fight for the Future for a comment they made; it's a reliable source for that, but is the "shady surveillance vendor" comment notable enough to include in this article if nobody else mentions it?
     Done I think this is semi-notable as some other sources mention them. However, I grouped them under "other commentators" since the remarks from the senator are much more notable
  • What makes Biometricupdate.com a reliable source? You cite them as one of four sources saying some information "was not received positively", but I think rather than using the passive we need to say who did not receive it positively, and for that we need reliable sources. Biometricupdate.com's own reaction to the news is not noteworthy but if they're reliable then their report of others' reactions might be.
     Done The sourced articles showed some of the sources they used for their reporting. I did not make any determination on RS, but I did rework the article a bit so they are no longer a source.
  • How confident can we be that the document in documentcloud.org is authentic? What guarantees that? Is the claimed uploader authenticated?
     Done The document is contributed by Buzzfeed and is linked to in this Buzzfeed News Article.
  • FYI, the mississauga.com link is dead. This is not a problem for GA, but you may want to find an archived link for it.
     Done
    Looks like this is still an issue? I'm referring to FN 86. Mike Christie (talk - contribs - library) 15:23, 9 June 2024 (UTC)[reply]
  • FN 91 is described in the citation as 980 CFPL which is a name I can't find at the linked page; it seems to be a Global News page.
    980 CFPL is in the top tagline next to the author's name
  • techdirt.com appears to be a group blog, and hence not reliable.
     Done
  • What makes noyb.eu a reliable source?
    It is a source for a POV claim about the views of that organization. I think this is a question of notability of the POV claim, not reliability? For notability, this is difficult for me to say. As far as privacy groups go, are there really any truly notable ones? On the other hand, I think it would be a disservice to the reader to not include any remarks from self-described privacy advocates, as some of them could be interpreted as notable to a reader in the more narrow privacy context. This group has gotten some attention and has their own WP article, though I do not think it is very good. At least some think they are notable.
    I think that's good enough. Mike Christie (talk - contribs - library) 15:23, 9 June 2024 (UTC)[reply]

Once these are resolved I'll do a spotcheck. Mike Christie (talk - contribs - library) 21:52, 7 June 2024 (UTC)[reply]

Thanks Mike. Tracking progress in line, I hope you are ok with that Czarking0 (talk) 23:29, 7 June 2024 (UTC)[reply]

Sure; I have this watchlisted and will keep an eye. Will be intermittently busy the next few days but should be able to get back here whenever you're ready for me to look at the article again. Mike Christie (talk - contribs - library) 02:06, 8 June 2024 (UTC)[reply]
Ok Mike, I appreciate your insight. Your comments have made me understand several flaws with this article. I have responded to all your comments. If there are any changes that are unsatisfactory just let me know.
If you think it should just be failed here I would not be offended. However, if you want to keep the review going I will continue to work on it. Czarking0 (talk) 18:53, 8 June 2024 (UTC)[reply]
No need to think about failing it; questions about sources are very common in GA reviews. It'll be some time tomorrow before I can go through your replies but I'm sure the reliability issues can be sorted out, if there are any left over after the changes you've made. Mike Christie (talk - contribs - library) 21:47, 8 June 2024 (UTC)[reply]
I've struck most points; a couple of items left. Will read through and leave further comments next. Mike Christie (talk - contribs - library) 15:23, 9 June 2024 (UTC)[reply]

More comments

  • The first thing that strikes me about the article is the lists under the "Use" section. Compliance with the guidelines for list incorporation is one of the GA criteria; I think these lists are a problem. I see some of the entries on the list have no sources, which is an issue in itself, but overall I think it would be better to identify high profile examples of the different categories and either present the lists in prose, or make the lists much shorter -- no more than three or four in each of the three categories. E.g. "Clearview has been trialed by many law enforcement agencies, including the Royal Canadian Mounted Police and the New Zealand police, and was purchased by others, including the Swedish Police and the Metropolitan Police in London". That's assuming "many" can be sourced directly. I don't think the reader needs the full list unless we have evidence that the list is itself notable -- that is, that other sources find the list itself, rather than just some of the organizations on the list, to be notable.
    • I think you have a great comment here. I did some research into this and I do think the list is sufficiently notable that it should be on WP. However, I think we should consider making a separate list-class article and then giving a few examples here. Another troubling point on this is that I believe BuzzFeed got exclusive rights to publish the list as obtained through "hacking", and I do not think they have actually published the list as a whole. This would make it more difficult to verify an article on the list itself, since it seems that only the notable elements are published. I do think readers would like to check if institutions they care about are on the list. I also think readers who just care about what kind of customers the company has had could use a summary like "Their customer list includes X American police departments, Y federal law enforcement organizations, Z universities, and W international police departments." Czarking0 (talk) 19:41, 9 June 2024 (UTC)[reply]
  • The lead is a little short for an article of this length. I think it could be about twice as long as it is. The relevant guideline is WP:LEADLENGTH. A related point is that the history section starts "Clearview operated in near secrecy until ...": this assumes the reader already knows what Clearview is -- it's written as if the lead paragraph is the first paragraph of the body of the article. See WP:LEAD for general guidelines around the lead, but the basic idea is that the body of the article should be complete without the lead, and the lead should be a summary of the body. Here that's not the case.
    • will get to this after the other stuff. I like working on the lead at the end.

I'm going to hold off reading through and making more detailed comments until that's addressed; in the meantime I'll do the spotchecks:

  • FN 26 cites "Clearview's attorney, Tor Ekeland stated the flaw was corrected". Verified. I would suggest changing "the flaw" to something like "the flaw in their security", to be clearer.
     Done
  • FN 49 cites "the company has demonstrated its search can identify people while they wear a protective mask". I think this could be rephrased -- this source only says Ton-That successfully identified one person who had their nose and mouth covered, not necessarily with a mask. It's Gross who generalizes this. If there's another source for the more general statement I would use that, otherwise perhaps "Hill found that Clearview's search could identify him even when his nose and mouth were covered, as they would be with a COVID mask".
     Done
  • FN 38 cites "In October 2021 Clearview submitted its algorithm to one of two facial recognition accuracy tests conducted by the National Institute of Standards and Technology (NIST) every few months. Clearview ranked amongst the top 10 of 300 facial recognition algorithms in a test to determine accuracy in matching two different photos of the same person, instead of the test for matching an unknown face to a 10 billion image database, which more-closely matches the algorithm's intended purpose. This was the sole third-party test of the software at the time." Verified; some of the wording is pretty close to the source but the relevant phrases are hard to reword so I think it's OK. I initially misread the last sentence as saying there were no other ways to test the software, but I see it means that Clearview had not previously been tested by a third-party. Perhaps rephrase a little? And it might be worth mentioning that NIST had another more suitable test which Clearview did not submit to.
     Done
  • FN 20 cites "According to the BBC in 2023, few cases of mistaken identity using Clearview facial recognition have been documented, but "the lack of data and transparency around police use means the true figure is likely far higher." Ton-That claims the technology has approximately 100% accuracy, and attributes mistakes to potential poor policing practices. Ton-That's claimed accuracy level is based on mugshots and would be affected by the quality of the image uploaded." Verified.
  • FN 97 cites "In another Florida case, Clearview's technology was used by defense attorneys to successfully locate a witness, resulting in the dismissal of vehicular homicide charges against the defendant." Verified.

One minor rewording needed out of five checks; this is a pass for the spotcheck once that issue is fixed. Mike Christie (talk - contribs - library) 16:15, 9 June 2024 (UTC)[reply]

More comments:

  • Are there any images that could be used -- of any of the people named, perhaps? GA doesn't require images: the criterion is that the article be "illustrated, if possible", but I don't see any justification for fair use claims, so there may be nothing usable.
  • The material in the infobox is unsourced. This is OK if it's sourced in the article, but the founding date is unsourced, for example.
  • "the company maintained a low profile until late 2019, until its usage by law enforcement was first reported": it didn't become well-known till then, certainly, but is it accurate to say that it "maintained" a low profile? That would imply they deliberately avoided publicity, which might be true, of course.
  • Why is it relevant where Ton-That and Schwartz met?
  • 'Noted far-right "troll king"' is not a neutral description.
  • I think the history section could be organized a little more. Currently it's a series of paragraphs that range across several topics: the corporate history of the company; use by clients; lawsuits/cease-and-desist orders; and a couple of other things such as security. I would suggest pulling the purely "corporate history" sentences together under history, and grouping the other material under one or two more appropriate headings. You already have a "legal challenges" section; I don't think we need to repeat that material here, so perhaps just moving the non-history material would work. And the material in this section jumps around: for example, "The settlement with the American Civil Liberties Union" is mentioned as if we already know what lawsuit this is, but it hasn't been mentioned before.

I'm going to pause the review here, because I think addressing the structure will change the article quite a bit, and I'd like to wait till that's done before doing a full pass through. Mike Christie (talk - contribs - library) 17:28, 10 June 2024 (UTC)[reply]

  • I looked and was unable to find images of the founders that can be used. I am not sure what else would make a good image. I think potentially a graph from the NIST study?
  • Working on the other structural stuff. Czarking0 (talk) 21:02, 10 June 2024 (UTC)[reply]
  • It is accurate to say the company maintained a low profile as there are many documented cases of how they avoided journalism. Here is a quote from FN1

Clearview has shrouded itself in secrecy, avoiding debate about its boundary-pushing technology. When I began looking into the company in November, its website was a bare page showing a nonexistent Manhattan address as its place of business. The company’s one employee listed on LinkedIn, a sales manager named “John Good,” turned out to be Mr. Ton-That, using a fake name. For a month, people affiliated with the company would not return my emails or phone calls. While the company was dodging me, it was also monitoring me. At my request, a number of police officers had run my photo through the Clearview app. They soon received phone calls from company representatives asking if they were talking to the media — a sign that Clearview has the ability and, in this case, the appetite to monitor whom law enforcement is searching for.

  • Where they met is relevant because it helps explain the connection between one of the best facial recognition companies and the right wing. If they had met at a restaurant that would be less notable. The notability is further established by publication of this fact in NYT per FN1.
  • Agreed, is "right wing troll" more neutral? I had to read his WP page to know who he was so I am open to other interpretations.


Czarking0 (talk) 21:13, 10 June 2024 (UTC)[reply]

I've struck some points as your answers above address them; feel free to post after each point if you like (I think it's easier to follow the individual answers that way). FYI, the Wikipedia indenting syntax is pretty opaque, but there's a simple rule that helps: copy whatever the last indent was (e.g. "*" or "*:" or whatever) and then add a ":" for indent and a "*" for an indented bullet. It's worth getting right per WP:INDENTMIX because otherwise it becomes a mess for non-sighted readers who use screen readers, which don't handle mixed-up indents very well. So if you want to reply to a bullet point of mine, with a "*", you'd put "*:" to reply with an indent but no bullet, and "**" to reply with an indented bullet. Then I might reply to that with "**:".
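As a rough illustration of the indenting rule described above, a reply chain might look like this in wikitext (the comment text is made up for the example; only the leading "*" and ":" characters matter):

```wikitext
* Reviewer's original bullet point.
*: Reply with an indent but no bullet.
** Reply with an indented bullet.
**: Reply to the indented bullet, indented with no bullet.
```

Each reply copies the prefix of the comment it answers and appends one character, which keeps the list nesting consistent for screen readers.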

Re your last point, I think we can source "right-wing" easily enough, but "troll" is a POV term that we can't use per WP:NPOV (for which a good summary is that it should be impossible for a reader of the article to tell where the sympathies of the writers of the article lie). We need to be accurate, though we don't have to be complimentary if the facts aren't complimentary. I don't know this person so I don't know what the right description is, but something like "Right-wing blogger" would be fine. If we need to emphasize that they deliberately post things with the intention of causing trouble, we need to find a source that states that factually and cite that in support of the description. Mike Christie (talk - contribs - library) 22:37, 10 June 2024 (UTC)[reply]

From his page "Johnson is often described as an internet troll and has been repeatedly involved in the proliferation and spread of multiple fake news stories." This has three sources which I believe are reliable. So maybe this is a matter of including those in this article? Czarking0 (talk) 23:29, 10 June 2024 (UTC)[reply]
I think there's a difference between "has been described as a troll" and "is a troll"; one is a factual description, and the other is an opinion. I think it would be better to say he is known for spreading fake news stories, and leave the word "troll" out of it. But why do we need to even mention Johnson as a customer? What does it tell us about Clearview? Is there some evidence of collusion or political leanings on Clearview's side, that they gave him an account? At the moment the sentence just says "Hey, look, this troll had an account", which feels like tainting by association. Mike Christie (talk - contribs - library) 00:35, 11 June 2024 (UTC)[reply]

Ok I believe that the history, marketing, and legal challenges sections are now significantly improved. I am most curious if you have other concerns with those sections?

If not I will move on to the list. I really do think it is notable; however, as you pointed out there are quite a few lines that are not sourced. I can go confirm/remove all those and then we will see where we stand? Czarking0 (talk) 06:02, 12 June 2024 (UTC)[reply]

I hope to find time to read through again this evening and will comment again then. Re the list, I would recommend making a separate List of Clearview AI users if you think it's notable. I'm doubtful: I can see why it *might* be notable, but the fact that so many entries are unsourced and will have to be sourced individually implies that the list as a whole is not treated as a single reportable entry by most sources. I do think it should be trimmed to just the sort of prose paragraph I gave as an example above. You can save it on the talk page if you want to keep it around while deciding whether to make a separate list article of it. Mike Christie (talk - contribs - library) 11:29, 12 June 2024 (UTC)[reply]
Did you see my comment that the list was exclusively obtained by BuzzFeed and they have decided not to publish it in its entirety? To me this seems like a journalistic strategy rather than anything about the notability of the list as a whole. I could be wrong though. Czarking0 (talk) 22:21, 12 June 2024 (UTC)[reply]
I did. I also saw your comment about readers wanting to check if a particular organization uses the software. I think reasonable people can disagree on this one, but at the moment I think the article would be better without the list. I'd be OK with a shorter list of maybe half a dozen of the most prominent users, perhaps in addition to the short paragraph approach I suggested above. Mike Christie (talk - contribs - library) 23:06, 12 June 2024 (UTC)[reply]
That seems like a good middle ground. I will work towards that. Czarking0 (talk) 03:31, 13 June 2024 (UTC)[reply]
Ok everything in the list is sourced and it is now much shorter. Czarking0 (talk) 17:30, 13 June 2024 (UTC)[reply]

Another read through

  • "It maintained this secrecy by exerting significant influence on what information can reported on. For example, they have called police officers to ask them why they were communicating with journalists and the founders tried to erase all their social media presence." The support for this seems to be the quote given in FN 14: "I see you have a lot of photos on the internet you should be in the app but you're not here... A couple of minutes later he said he got a call from someone who worked for Clearview AI and they wanted to know why he'd been running my photo." I don't think this works: this doesn't say they successfully exerted influence -- it just gives a single example of a call they made. The BuzzFeed News article describes them as unresponsive to some press enquiries and with some deleted history, but that's not the same thing.
    Edited the claim to better reflect sources.
    What's the source for "discouraging users from talking to the press"? The NYT article says they called the police departments who ran the journalist's photo, and the quote from the audio is similar; those certainly indicate that Clearview wanted to know about media interest but there doesn't seem to be anything saying they told users to avoid talking to the media. Am I missing something elsewhere in the sources? Mike Christie (talk - contribs - library) 11:12, 15 June 2024 (UTC)[reply]
    No you have it right, I agree I went too far here in saying they discouraged users from talking to the press. The fundamental point I am trying to get across is that the company was secretive. This is mentioned in nearly every source when they introduce the company. On the other hand it is hard for me to point to a fact that demonstrates how they are secretive. I am hesitant to ignore the number of sources that call them secretive since these reliable sources are probably better able to judge that than I am even if they don't publish an analysis of what makes them secretive. Maybe the middle ground here is to leave it at "publishing fake information about the company's location and employees" which is verifiable? Czarking0 (talk) 16:13, 16 June 2024 (UTC)[reply]
  • I suggest moving the definition of what the software does to the first paragraph of the body.
     Done
    I don't think this is done -- it's in the lead, but the first para of the body needs to say it since the lead is supposed to be a summary of the body. Mike Christie (talk - contribs - library) 11:12, 15 June 2024 (UTC)[reply]
    By the first paragraph of the body, do you mean the history section? I feel like it does not really fit there. I moved it to the beginning of the usage section. Is there potentially value in putting usage above history? I don't know that the average reader cares so much about the corporate history so I could go either way on that. Czarking0 (talk) 15:50, 15 June 2024 (UTC)[reply]
    After thinking about it some more I think this is a stylistic choice so I'm going to strike this point. I do think it's good to give the reader the key information early in the body. Mike Christie (talk - contribs - library) 18:39, 15 June 2024 (UTC)[reply]
  • I've put some of the corporate history sentences together in a shortened "History" section, and put the remaining material in a "Usage" section -- let me know if you think that works. It seemed easier to do that than to try to explain which paras I thought went together.
     Done
  • "demonstrated Clearview's expansive, multi-year collaboration with the NYPD. These records demonstrated, contrary to past NYPD denials, that Clearview provided accounts ...": suggest "demonstrated that Clearview had collaborated with the NYPD for years, contrary to past NYPD denials. Clearview provided accounts ...".
     Done
  • The first paragraph of "Marketing efforts and pushback" covers the NYPD; this was already discussed a couple of paragraphs earlier. Can we combine the discussions into one paragraph?
     Done. Technically not one paragraph, but I think I hit the spirit of your comment.
  • "The company markets directly to police officers by encouraging them to "run wild" by searching for family, celebrities, and suspects": this was an email directly to an officer in Green Bay in 2019 -- they probably did send it to multiple recipients, but we don't know that, so we can't phrase this as a general statement. We can give the quote as an example of how they marketed themselves, but I think it should be clear in the body of the article (rather than by following the citation) that this is taken from an email to one of their clients (rather than an exhortation in a brochure or posted on their website, for example).
    Changed this, I think it is better now?  Done
  • "Clearview had claimed that its app played a role in a New Jersey police sting, which Grewal confirmed had been used to identify one of the child predators": why "one of the child predators"? We haven't mentioned any child predators before. And I'm not clear from this whether Grewal was confirming they did use Clearview in the sting, or just confirming that the sting identified a child predator.
     Done
  • Why do we mention Jessica Medeiros Garrison? It seems from the citation that Clearview is owned or part-owned by MDM27 Holdings; if we have a good citation for that it can go in the history section. But Garrison's name doesn't help the reader at all unless we have more information about her or MDM27 or there's a link we can add.
     Done
  • "Documents from Clearview have claimed 98.6% or 100% accuracy using a 99.6% confidence interval." This doesn't make sense. A 99.6% confidence interval means that Clearview assert that the image will be correctly identified 99.6% of the time. "Interval" is not the right word anyway, since this is not across a range of a parameter -- when one says "x lies between 10 and 15, with 99.6% confidence", that's a confidence interval, because 10-15 is an interval. In this case we're just talking about a confidence level. But that's just a claim that Clearview make about accuracy, so it doesn't make sense to say they claim either 98.6 or 100% accuracy when claiming 99.6% confidence.
    This could get really nuanced which makes me want to back up for a moment. Do you have a background in this? My master's is in statistical experiment design. FN61 shows that the 99.6% confidence interval is not the alpha for the accuracy; it is the alpha for the match of the input face to the result. The reported accuracy depends on the alpha for the match. Obviously the WP article as it currently stands is not communicating this. Overall I am somewhat opposed to quantitatively stating the accuracy in WP as a single number. That is really oversimplifying the system. However, getting detailed with the performance seems quite technical for WP and probably not what the typical reader of this article is looking for. I would prefer to report more of a summary along the lines of "it is one of the best in the world" given appropriate sourcing.
    As a side note I think you have some misunderstanding of CI. "x lies between 10 and 15, with 99.6% confidence" is not strictly true (depends on what "confidence" really means). The better summary is: x either lies in the interval or not with probability 1. Intervals constructed at a given alpha (99.6% in this case) contain the target parameter with alpha frequency. This is not the same as a specific CI containing a specific parameter.
    My degree is in pure mathematics, with a bit of post-graduate study; no, I'm certainly no expert in statistics and am happy to concede your points about the imprecision of my comments. What I was trying to get at was that the lay interpretation of a single % confidence number is "the odds are this % that this is correct". I may be wrong, but I doubt the sources are precise in the way you discuss. The article says "Documents from Clearview have claimed 98.6% or 100% accuracy using a 99.6% confidence interval." The source for the 99.6% figure appears to be this statement in the test document: "Unlike Amazon’s Rekognition, Clearview does not allow the user to set the confidence level, but instead is fixed at 99.6%." The 98.6% figure comes from the BuzzFeed News article which says "In marketing materials to Atlanta police, Clearview claimed that it could accurately find a match 98.6% of the time in a test of 1 million faces." I don't think we can combine these two statements as the article does: these seem to me like layperson statements about accuracy, not precise statistical assertions. Mike Christie (talk - contribs - library) 11:30, 15 June 2024 (UTC)[reply]
    I think you are right here. What do you think of my suggestion to shift to more qualitative claims? We could go with the claim in their marketing material, but I don't love using their own marketing for the reported accuracy. I could dig into the test document a bit more to see if there is a more reliable summary statistic that is useful to the layperson.
    I don't think it's worth the trouble; the numbers are probably all nonsense anyway. Can we say something like "At various times, Clearview have claimed 98.6%, 99.6%, and 100% accuracy"? The word "claim" avoids the implication this is anything more than marketing. Mike Christie (talk - contribs - library) 18:28, 15 June 2024 (UTC)[reply]
     Done Czarking0 (talk) 04:26, 16 June 2024 (UTC)[reply]
  • "Ton-That claims the technology has approximately 100% accuracy": another, different, accuracy claim. We should probably put dates on these claims, so the reader doesn't think they're simultaneous and hence inconsistent.
  • "The Android version contains references to": I think it should be clearer to the reader that the functionality described in the next couple of sentences was found by reading the code of the app, and the reporters weren't able to demonstrate working functionality.
    What is the implication here? That this may not actually be the code for the app? All those claims would still be relevant if it was an outdated version. Though I am not 100% convinced of notability.
    I think it's notable enough to include -- it shows intention on Clearview's part to do these things, and a reader would certainly be interested in that. It doesn't prove they successfully implemented these functions to the point that they worked, or that any user ever used them -- it's common to include software in a released product that is draft or inaccessible, and can't be accessed -- for example in a library of functions. How about making it "... an examination of the code for the Android version revealed references to ..."? That would make it clear to the reader that the source never saw it working in practice. Mike Christie (talk - contribs - library) 11:40, 15 June 2024 (UTC)[reply]
     Done
  • "Clearview also operates a secondary business, Insight Camera": we say "operates", but should it be "operated"? The website is no longer up.
    Some additional googling points to the claim that Insight Camera's website was taken down after the press started asking about their connection. That seems speculative to me, but also points to the fact that taking down the website does not necessarily imply ceasing to operate the venture. Maybe this is notable in itself? Like we say "operated" but note the website was taken down and the company has not published anything since being contacted?
    Hmm, not sure what the best option is here. If you have a reliable source for the site being taken down after the publicity started, maybe say that? Mike Christie (talk - contribs - library) 11:40, 15 June 2024 (UTC)[reply]
     Done
  • How about combining the "cases" section under the lists with the "usage" section I created? In either location -- maybe moving the "usage" stuff down to under the lists would make the most sense. That way the article organization would be corporate history, then the technology itself, then uses and marketing, and finally the legal challenges. That seems a logical sequence to me -- what do you think?
    agreed, will do  Done
  • The legal challenges section is very fragmented; I think this is because it sticks strictly to chronological order. See WP:PROSELINE for an essay giving advice about this sort of prose. Can we make it a bit more thematic? E.g. start with a para saying multiple states and organizations have sued Clearview, and give examples; then give details for any that seem important enough; then cover any other information such as fines and rulings (e.g. the EU's decision that their photo database was illegal). The mention of the particular lawyers they hire might go in the corporate history section, but if not then I'd put those separate mentions together -- e.g. "Clearview's lawyers have included Tor Ekeland, Paul Clement, and Floyd Abrams", and then give dates if available and relevant, and any quotes.
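
The frequency reading of a confidence level raised in the accuracy thread above (intervals built by the same procedure contain the true parameter at roughly the stated rate, while any single interval either contains it or does not) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the true mean, standard deviation, sample size and trial count are all made up for illustration, the interval is a textbook z-interval with known sigma, and none of it models Clearview's system or its reported figures.

```python
import random
from statistics import NormalDist, mean


def coverage(level=0.95, true_mean=10.0, sigma=2.0, n=30, trials=2000, seed=0):
    """Estimate how often a z-interval for the mean contains the true mean.

    All parameter values are hypothetical. Each trial draws `n` normal
    observations, builds the interval sample_mean +/- z * sigma / sqrt(n),
    and records whether the true mean falls inside it.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample_mean = mean(rng.gauss(true_mean, sigma) for _ in range(n))
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / trials


print(coverage(level=0.95))   # fraction of intervals containing the true mean
print(coverage(level=0.996))  # the same procedure at a 99.6% level
```

Each printed fraction should land near the chosen level, which is a statement about the procedure across many repetitions rather than about any one interval.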

That's it for this pass. The lists look fine now. I think the main problem with the article initially was organization, which is why it's taking multiple passes for me to give you this feedback. It's getting there, though. Mike Christie (talk - contribs - library) 22:05, 13 June 2024 (UTC)[reply]

  • This sounds good. On my todo list. Made some progress; still WIP. Czarking0 (talk) 17:00, 15 June 2024 (UTC)[reply]

Czarking0 (talk) 02:46, 15 June 2024 (UTC)[reply]

Looks like you're still working on a couple of points; I've gone through and struck or replied to the points you've dealt with. Mike Christie (talk - contribs - library) 11:43, 15 June 2024 (UTC)[reply]
Replied to your first point about discouraging users. I worked on the legal history a bit more and I think it is better now. I am not sure that it is sufficient. Can I get some more feedback there? I believe that covers all the points made here.
FYI there is some breaking news about another settlement that is notable; however, I believe the story is not sufficiently settled to include it at this time. Czarking0 (talk) 16:16, 16 June 2024 (UTC)[reply]

Final pass

You've done so much to improve the article that I'm not going to go through and strike the remaining points above; I'll just read through again and note any outstanding issues here. I did read your comments above and will include responses below.

  • Re secrecy, I think you have pretty good citations that say how secretive they are -- the NYT article is titled "The Secretive Company That ..." after all, and we quote that title in this article. I think we can drop "discouraging users from talking to the press" without changing the message to the reader that this was a company that did not want media scrutiny.
  • "Clearview came under renewed scrutiny for enabling officers to conduct large numbers of searches without formal oversight or approval." This is now uncited; I suspect it got detached from its citation when you were moving text around.
  • "What Clearview does is mass surveillance and it is illegal. It is completely unacceptable for millions of people who will never be implicated in any crime to find themselves continually in a police lineup." This seems to be a quote but it's uncited. Is it from Therrien? If so I'd tack it on to the previous sentence, in quotes, rather than indenting it as you have here: "... hundreds of illegal searches using Clearview AI, and said "What Clearview does is mass surveillance ..." and then add whatever the relevant citation is.

That's it for this pass. I read through the legal section again; a couple of bits of info have been moved elsewhere and I think this is OK now -- it's still a bit fragmented but that's just the nature of the information that has to be conveyed. Mike Christie (talk - contribs - library) 18:58, 16 June 2024 (UTC)[reply]

Ok I addressed these points. I appreciate the attention. Czarking0 (talk) 03:53, 17 June 2024 (UTC)[reply]
Fixes look good. This is GA quality now, so I'm passing it. Congratulations, and thank you for being patient with my nitpicking. I also want to say that the reason I picked this article to review was that I saw you'd done quite a few reviews yourself -- I like to prioritize reviewing nominations by editors who are also contributing to the reviewing side of GA, so thank you for those reviews. Mike Christie (talk - contribs - library) 09:33, 17 June 2024 (UTC)[reply]
Thanks Mike, this is my first GA so I am very happy this morning. I'll certainly be doing more reviews in the future! Czarking0 (talk) 14:29, 17 June 2024 (UTC)[reply]

Evansville police personal usage

Ars Technica article for expansion tedder (talk) 15:41, 14 June 2024 (UTC)[reply]

Johnson and new source

Moving a comment here from the GA review -- FeralOink, when a GA review is completed it's best to add comments to the article talk page rather than the review, since the review may be archived and hence not immediately visible to editors looking at the article talk page. Mike Christie (talk - contribs - library) 11:28, 21 June 2024 (UTC)[reply]

I wanted to clarify one point and also bring attention to an important new source that Czarking0 added less than a month ago. First, I agree that it was not appropriate to describe Charles Johnson as a troll here. Mike Christie, you were correct! Do note though that the reason we need to mention Johnson is that he was a co-founder and partial owner NOT a customer. Mike had written above: "...but why do we need to even mention Johnson as a customer? What does it tell us about Clearview? Is there some evidence of collusion or political leanings on Clearview's side, that they gave him an account?"

Mike's points are still valid, about not placing undue emphasis on Johnson, merely a passing mention, which is exactly what the article currently has. Here's why: Johnson v. Clearview AI, Inc. dated May 20, 2024. That's the source recently added by Czarking0. See pages one through three. It states there that the company was founded in Feb 2017. On 24 Nov 2018, Ton-That and Richard Schwartz removed Johnson from the company, although Johnson retained 10% ownership but no longer had any role in running it. This new source, about the outcome of the lawsuit as documented in the Justia link, provides additional information for someone to add to the article in the section about legal matters. To summarize, Johnson sued the other two because he said they agreed to give him money for marketing after he left in 2018, and that he didn't get as much as he expected. Three out of the four counts by Johnson against the two Clearview guys were dismissed by the court. (Clearview isn't trying to get the first count dismissed.) I'm not sure if this is the right place for these remarks, so you can move them (or I will, if you prefer) to a new section if that is better.--FeralOink (talk) 11:08, 21 June 2024 (UTC) Mike Christie (talk - contribs - library) 11:28, 21 June 2024 (UTC)[reply]

Hey Feral, I am glad you brought extra attention to this. I had not added the Johnson v Clearview lawsuit to the legal cases section because I did not see sources that indicate that suit's notability. In fact there are several lawsuits that Clearview was involved in which are not mentioned. This is because I did not think it was justified for this company profile to dig through the court filings themselves unless another source showed that the info in the court filings was notable and there were factual matters to verify in the filings.
I did reference this suit in the history ownership section since the owners of the company and how it was founded are demonstrably notable via other sources and that suit seemed like the best source to verify Johnson's ownership. If you think more should be said about Johnson let me know and I am happy to dig into it more. Czarking0 (talk) 20:06, 21 June 2024 (UTC)[reply]