Wikipedia:Link rot/URL change requests/Archives/2020/July


IHF

The International Handball Federation built a completely new website and moved the old content from www.ihf.info to archive.ihf.info. Could a bot change all the old links, in all Wikipedia editions, to point to the archive?

For example www.ihf.info/en-us/ihfcompetitions/competitionsarchive/menworldchampionships.aspx moved to archive.ihf.info/en-us/ihfcompetitions/competitionsarchive/menworldchampionships.aspx.

The old link displays a 404 error page. Only the www needs to change to archive. So is it possible to change all links? And in the same run, could the bot archive all these archive.ihf.info pages? Thank you very much. --Malo95 (talk) 10:49, 16 June 2020 (UTC)

Malo95, no problem. I will do this. There are other requests backed up above, so I will get to this in turn; it takes some time. -- GreenC 18:43, 16 June 2020 (UTC)
I just checked and this domain is only in 4 templates. Because of transclusions it will look like it's used on a lot more. I went ahead and manually changed those 4 to "archive" and it should be good to go. -- GreenC 19:00, 16 June 2020 (UTC)
@GreenC: Thank you for trying, but there are many more old links on the wiki. For example, at 1938 World Men's Handball Championship the external link should also point to the archive. There are 1642 links to www.ihf.info, of which the older ones should point to archive.ihf.info. There are also old links in other language wikis, such as the German wiki. Could you please look again? --Malo95 (talk) 13:49, 21 June 2020 (UTC)
Sorry my search was limited to template space, see them now. Will process when the bot is free in a few days. It is only configured for enwiki templates. -- GreenC 15:09, 21 June 2020 (UTC)
@GreenC: I saw you made the first run, but I noticed some problems, for example at the Handball article.
Not all URLs have to be changed to the archive, only those that return a 404 error page. The others still work, because the new website was created a year ago and all links added since then work fine.
You wrote that your bot works only on the English wiki. Where do I have to ask to run a bot over all language wikis and other wiki projects like Commons? --Malo95 (talk) 15:41, 22 June 2020 (UTC)
Glad you saw those edits (about 5). I've modified the bot so that it skips the link if the original URL returns 200. I'm not aware of anywhere on other wikis that provides this type of service. Maybe try a request at the German equivalent of WP:BOTREQ, "Wikipedia:Bots/Anfragen" (mention the mix of 404 and 200 among the source URLs). -- GreenC 18:53, 22 June 2020 (UTC)
Another complication: sometimes when loading the new archive.ihf.info page, it redirects to a home page, i.e. a soft-404. Those cases should be converted to an archive URL of the original URL. Example. -- GreenC 22:59, 22 June 2020 (UTC)
For Commons, I have the ability, but there are only 11, so it is better done by hand. -- GreenC 18:58, 22 June 2020 (UTC)
Malo95: The bot posted another 20 pages. Can you review before proceeding? -- GreenC 23:08, 22 June 2020 (UTC)
Done for about 3,000 cites in 900 articles. It found roughly a 3:1 mix of converting to the new URL vs converting to an archive URL. -- GreenC 19:47, 26 June 2020 (UTC)
Thank you very much. It looks fine to me. I'll do the Commons ones manually. For the German wiki, is the following information enough? Steps:
  1. Open a link from the article.
  2. If it returns 200, abort; otherwise try the same path on archive.ihf.info.
  3. If that returns 404, redirects to the index page, or shows "The resource you are looking for has been removed, had its name changed, or is temporarily unavailable.", the link is dead: use web.archive.org. Otherwise change the given link to archive.ihf.info.
--Malo95 (talk) 15:28, 27 June 2020 (UTC)
Yes, those are the steps. The "temporarily unavailable" page [1] returns 404, so it's pretty easy: verify that archive.ihf.info does not return a 404 or 301. -- GreenC 16:25, 27 June 2020 (UTC)
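The steps above can be sketched as a small decision function. This is only an illustration, not the bot's actual code: the names `to_archive_host` and `decide` are hypothetical, the HTTP status codes are passed in rather than fetched, and the soft-404 check (redirect to the index page) is assumed to have been done by the caller.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "www.ihf.info"
NEW_HOST = "archive.ihf.info"


def to_archive_host(url):
    """Swap www.ihf.info for archive.ihf.info, leaving the rest of the URL alone."""
    parts = urlsplit(url)
    if parts.netloc.lower() != OLD_HOST:
        return url
    return urlunsplit(parts._replace(netloc=NEW_HOST))


def decide(url, original_status, archive_status, archive_redirects_to_index=False):
    """Return (action, url) for one citation.

    original_status: HTTP status of the link as cited (assumed already fetched).
    archive_status:  HTTP status of the same path on archive.ihf.info.
    archive_redirects_to_index: True when the archive copy is a soft-404.
    """
    # Step 2: a working original link is left untouched.
    if original_status == 200:
        return ("keep", url)
    # Step 3: a 404, a 301, or a soft-404 on the archive host means the link
    # is dead, so the original URL should go to web.archive.org instead.
    if archive_status != 200 or archive_redirects_to_index:
        return ("wayback", url)
    # Otherwise the archive host serves the page: rewrite the link.
    return ("rewrite", to_archive_host(url))
```

Keeping the network requests outside `decide` makes the rule easy to check by hand against a few known links before a full run.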
Hi Malo95, GreenC, please note that .../2MTR.pdf (instead of /2OMR.pdf) works. However, I don't know if that will be true in all cases. If it's not too late, you can try to replace /xOMR.pdf with /xMTR.pdf (or maybe /0xMTR.pdf if x<10; I think I previously saw such a case, but I'm not sure). BTW, I think the MTR match reports are better than the OMR ones because they have more detailed statistics and the running score. --LeFnake (talk) 12:47, 1 July 2020 (UTC)
Found at least 3,338 unique URLs that have "xOMR.pdf"; there might be more. A quick test shows most of them could be successfully converted to "xMTR.pdf", which return status 200. Can you confirm this is what you want to do? It would only do so if the original URL is 404. @Malo95 and LeFnake: -- GreenC 14:13, 1 July 2020 (UTC)
Sounds good to me, but I'll let @Malo95: confirm --LeFnake (talk) 14:41, 1 July 2020 (UTC)
I'm also good if all reports are changed to MTR --Malo95 (talk) 14:57, 1 July 2020 (UTC)

@Malo95 and LeFnake: It turns out there are not 3,338 unique URLs but more like 355. Of those, only 8 converted successfully to MTR; the rest did not work. I made some mistakes in the initial assessment; the bot did the right thing. The conversion only works when the URL has the form /CompetitionData/05459bd8-a610-45d1-87a9-172e0b699e38/, not the form /CompetitionData/##/, which is all but 8. Example of a non-working URL: https://archive.ihf.info/files/CompetitionData/152/pdf/50MTR.pdf -- GreenC 17:23, 2 July 2020 (UTC)
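A minimal sketch of the OMR to MTR rewrite as described in this thread, restricted to the GUID-style /CompetitionData/ directory where the conversion was reported to work. The function name `omr_to_mtr` and both regular expressions are illustrative assumptions, and a real run would still need to confirm that the rewritten URL returns 200.

```python
import re

# File name of the form <number>OMR.pdf at the end of the URL.
_OMR = re.compile(r"/(\d+)OMR\.pdf$")

# GUID-style directory, e.g. /CompetitionData/05459bd8-a610-45d1-87a9-172e0b699e38/
_GUID_DIR = re.compile(
    r"/CompetitionData/[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}/"
)


def omr_to_mtr(url):
    """Return the MTR variant of an OMR report URL, or None if not applicable.

    Numeric directories such as /CompetitionData/152/ are skipped because
    the thread reports that the conversion fails for them.
    """
    if not _GUID_DIR.search(url):
        return None
    new_url, count = _OMR.subn(r"/\1MTR.pdf", url)
    return new_url if count else None
```

Returning None rather than the unchanged URL lets the caller fall back to an archive URL for the cases that cannot be converted.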

BFI

There are hundreds of film and TV-related articles with citations to website ftvdb.bfi.org.uk (to pick an example, Tread Softly (1965 film) refers to http://ftvdb.bfi.org.uk/sift/title/49683). However, all these links are broken, because the BFI closed down that website. Some of the ftvdb articles have been transferred to the BFI's main website, but there's no discernible connection between the identifying string used on the BFI website and the string used on the old ftvdb. For example, Ethan Frome was at http://ftvdb.bfi.org.uk/sift/title/474191 and is now at https://www.bfi.org.uk/films-tv-people/4ce2b7be1c909. Many of these ftvdb links are still accessible through Wayback (see https://web.archive.org/web/20121022014637/http://ftvdb.bfi.org.uk/sift/title/49683 for example). Since there's no way of mapping from the old article locations to the new ones, what's needed is a bot to add the archived version to these citations, if an archived version exists, and populate a new category for those where no Wayback archive exists so that they can be fixed manually by finding replacement sources.

(see discussion at Wikipedia_talk:WikiProject_Film#BFI_ftvdb_website)

Colonies Chris (talk) 08:23, 3 July 2020 (UTC)

Moved from Bot Requests. --Izno (talk) 12:32, 3 July 2020 (UTC)

Thank you, Izno. Chris, I can archive the existing links and provide a list of the URLs (and articles) the bot is unable to archive. -- GreenC 14:29, 3 July 2020 (UTC)

That would be very helpful, thanks. Colonies Chris (talk) 17:36, 3 July 2020 (UTC)

Completed 5,738 citations in 4,463 articles converting ftvdb.bfi.org.uk to archive URLs (about 5% non-Wayback archives). Fixed everything with a {{dead link}} except these:

Set IABot database to blacklist. -- GreenC 16:38, 4 July 2020 (UTC)

explore.bfi.org.uk and old.bfi.org.uk

Converted 2,049 explore.bfi.org.uk to archive URLs, and 251 old.bfi.org.uk. There are 20 explore with no archive available (added {{dead link}}), and 1 old. Set both to blacklist in IABot interface. -- GreenC 12:17, 5 July 2020 (UTC)

wikia.com

Convert *.wikia.com to *.fandom.com -- GreenC 02:07, 8 July 2020 (UTC)

Special case: fandom.wikia.com -> fandom.com, e.g. [2] ProcrastinatingReader (talk) 13:48, 15 July 2020 (UTC)
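The host rewrite and its special case can be sketched as follows. The function name `wikia_to_fandom` is hypothetical, and the sketch assumes only the host changes while the scheme, path and query stay as they are.

```python
from urllib.parse import urlsplit, urlunsplit


def wikia_to_fandom(url):
    """Rewrite *.wikia.com hosts to *.fandom.com, leaving other URLs unchanged."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == "fandom.wikia.com":
        # Special case from the thread: no "fandom." subdomain on the new host.
        new_host = "fandom.com"
    elif host.endswith(".wikia.com"):
        # Keep the subdomain (including its trailing dot), swap the domain.
        new_host = host[: -len("wikia.com")] + "fandom.com"
    else:
        return url
    return urlunsplit(parts._replace(netloc=new_host))
```

For instance, `wikia_to_fandom("http://fandom.wikia.com/articles/x")` gives `http://fandom.com/articles/x`, while a URL on an unrelated host is returned as-is.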

westmidlandbirdclub.com

For all citations to pages under westmidlandbirdclub.com, which should already be marked as archived, please add |url-status, thus, as the domain has been cyber-squatted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:07, 14 July 2020 (UTC)

@Pigsonthewing: It is done; changed 63 url-status values to unfit, except for these 12 that do not use cite templates:

-- GreenC 03:25, 17 July 2020 (UTC)

@GreenC: Thank you. I've cleaned up the remaining dozen. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:37, 18 July 2020 (UTC)

The source links for the land use of the municipalities in Switzerland (in the geography sections) point to a web page that is no longer available (e.g. Bulle), and only some of them have been linked to the Wayback Machine. Can someone link the rest of them to the Wayback Machine? --Horizon Sunset (talk) 14:58, 23 July 2020 (UTC)

Horizon Sunset: Yes, that is possible. Ideally the links would point to the new site; for example, the URL would be [3] (compare with the old link). But there is no redirect info to automate the conversion. Too bad, as this sort of info should be live since it is stats. -- GreenC 15:28, 23 July 2020 (UTC)
@GreenC: Thanks. So should I update the information for each municipality using the new link you provided? --Horizon Sunset (talk) 18:38, 23 July 2020 (UTC)
Looks like a lot.. Maybe do a handful manually to see if there is a pattern between old and new URLs. There may be no way, the URLs look complex, but worth a try. Sometimes there are discernible patterns. For example "gemeindedaten.html" in the old matches "gemeinden.html" in the new sans "daten". -- GreenC 19:28, 23 July 2020 (UTC)
I already did several of them. I'm not quite sure if "portal" matches "home". I didn't find any patterns other than that, but I'll try to update the stats for each municipality.--Horizon Sunset (talk) 22:55, 26 July 2020 (UTC)