08-08-23 04:16 PM |
Main - Computers and technology - Evil OpenAI web crawler
| fruityloops |
Newcomer Normal user Level: 1
Posts: 4/4 EXP: 10 Next: 1 Since: 08-07-23 From: Germany Last post: 23 hours ago Last view: 56 min. ago |
OpenAI is using crawlers to yank training data from your sites, without even checking the licensing of the content they are taking.
So if you want to prevent this, you can put the following in your robots.txt:

    User-agent: GPTBot
    Disallow: /

Alternatively, you can feed it garbage to fuck up their training data (example with nginx config, placed inside a server or location block):

    if ($http_user_agent ~* "GPTBot") {
        # realistically, you would put something more convincing than just a keyboard smash
        return 200 'asdlkfjsdjklfjsdlkfjsdkjfhgdfskjhgfd';
    }
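
If you want to sanity-check the robots.txt rules before deploying them, here is a minimal sketch using Python's standard urllib.robotparser. The helper name is_allowed, the example.com URL, and the "SomeOtherBot" agent are made up for illustration. It only shows how a crawler that respects robots.txt would read the rules; it says nothing about whether a given crawler actually obeys them.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the post: block GPTBot from the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: would a robots.txt-respecting crawler fetch this URL?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder; swap in a URL from your own site.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/some/page")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/some/page")

print("GPTBot allowed:", gptbot_allowed)
print("Other bot allowed:", other_allowed)
```

With only the two lines above in robots.txt, GPTBot is denied everywhere while agents that are not listed stay unaffected, since there is no catch-all "User-agent: *" entry.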