GitHub to begin training AI models on user data: Here's how to save yourself

GitHub to begin training AI models on user data by default next month

By Geo News Digital Desk

GitHub has announced a major policy change: it will automatically use customer interaction data to train its AI models.

User data will be used by default from April 24.

The Microsoft-owned platform said the update applies to Copilot Free, Pro, and Pro+ users, with interaction data, including code snippets, inputs, outputs, repository context, chats, and feedback, being used to improve model performance.

Copilot Business, Copilot Enterprise, students, and teachers are exempt from the change.

GitHub’s Chief Product Officer, Mario Rodriguez, stated: “By participating, you’ll help our models better understand development workflows, deliver more accurate and secure pattern suggestions, and improve their ability to help you catch potential bugs before they reach production.”

How to opt out

Developers who do not want to share their data are urged to follow these steps before April 24:

  • Log in to GitHub and navigate to Settings
  • In the left sidebar, click Copilot
  • Select the Features tab
  • Scroll to the Privacy section
  • Locate the option labeled "Allow GitHub to use my data for AI model training"
  • Set the toggle to Disabled

Users who have already opted out will have their preference retained automatically.

What data is GitHub using?

GitHub may collect and use data including inputs sent to Copilot, code snippets, outputs accepted or modified by users, code context and repository structure, comments, documentation, and navigation patterns.

The platform may also use users’ feedback ratings.

What data is not being used?

Amid rising safety concerns, GitHub clarified that the program will not use interaction data from Copilot Business or Enterprise users, content from private repositories at rest, or data from users who have opted out.

GitHub also made clear that the data may be shared only with affiliates, including Microsoft, and not with any third-party AI model providers.