Mingmin, Xiao Xiao | From Aofeisi
QbitAI | Official account QbitAI
Less than 20 days after Samsung allowed employees to use ChatGPT, there have already been three data leaks?!
Two of them were related to semiconductor equipment and one was related to an internal meeting.
As soon as the news broke, it sparked heated discussion, especially in South Korea. An article in the Korean edition of The Economist put it bluntly:
As a result, semiconductor equipment measurement data and product yields were transmitted, intact, to an American company.
Korean media even claimed that because Samsung employees typed confidential corporate information directly into ChatGPT as prompts, that content would enter its training data and could leak to many more people.
According to Samsung, to prevent this from happening again, it has told employees to use ChatGPT with caution. If similar incidents occur, it will consider blocking ChatGPT on the company intranet.
It seems that Samsung has made big news this time.
Some netizens joked that a certain "xx cloud drive" is now stocked with companies' internal documents (doge).
Other netizens, however, found something odd here:
How did they know the data had been leaked? Did ChatGPT ship a new version that quickly?
When the "Economist" reporter confirmed the authenticity of the news to Samsung, the relevant person in charge said that because it was an internal accident of the company, it was difficult to give a clear reply.
For now, the story is being covered mainly by South Korean media, and the details of how the chat content supposedly entered the training data remain to be scrutinized.
So, as the reports claim, will this data really be used to train ChatGPT and end up being seen by more people?
Is the data uploaded to ChatGPT secure?
3.1% of workers are feeding ChatGPT corporate data
The focal point of the controversy Samsung has stirred up this time is employees uploading internal semiconductor data to ChatGPT.
On March 11, Samsung's semiconductor arm, the Device Solutions (DS) division, allowed employees to use ChatGPT. In the following 20 days, there were three incidents of internal semiconductor data being uploaded to ChatGPT:
Employee A used ChatGPT to help find bugs in a piece of source code, and that code was related to semiconductor equipment measurement data. Employee B wanted ChatGPT to optimize some code for him, so he pasted in code related to yield and defective-equipment recording.
Employee C first used Naver's AI voice assistant Clova to transcribe a recording of a meeting, then asked ChatGPT to summarize the transcript into meeting minutes...
For now, Samsung has taken internal "emergency measures," capping each upload to ChatGPT at 1,024 bytes, and has also signaled plans to develop its own in-house AI.
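For a sense of what such a cap looks like in practice, here is a minimal sketch of a client-side guard that rejects over-long prompts before they leave the company network. The constant, function name, and enforcement approach are assumptions for illustration, not Samsung's actual tooling.

```python
# Hypothetical pre-send guard enforcing a per-prompt byte cap, mirroring
# the 1,024-byte limit Samsung reportedly imposed. Not Samsung's real code.
MAX_PROMPT_BYTES = 1024

def check_prompt(prompt: str) -> str:
    """Raise if the UTF-8 encoded prompt exceeds the byte cap."""
    size = len(prompt.encode("utf-8"))  # measure bytes, not characters
    if size > MAX_PROMPT_BYTES:
        raise ValueError(
            f"Prompt is {size} bytes; the limit is {MAX_PROMPT_BYTES} bytes."
        )
    return prompt

# Usage: check_prompt("short question")  passes;
#        check_prompt("A" * 2000)        raises ValueError.
```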
It is worth noting, though, that these reports are so far circulating mainly in South Korea, and OpenAI has not yet responded.
That said, the data-usage policy OpenAI updated last week does state that for non-API products such as ChatGPT and DALL-E, the platform may use user-submitted data to further improve its models.
For API products, by contrast, it states that user-submitted data will not be used.
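This distinction matters for companies: routing requests through the API rather than the ChatGPT web interface keeps the submitted text out of training data, per the policy above. A minimal sketch, assuming the openai Python package as it existed in early 2023 (the placeholder key and prompt are illustrative):

```python
# Sending text via the API path, which per OpenAI's updated policy is not
# used for model training (unlike the ChatGPT web interface).
import openai

openai.api_key = "sk-..."  # placeholder; load from a secret store in practice

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Summarize these meeting notes: ..."},
    ],
)
print(response["choices"][0]["message"]["content"])
```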
And Samsung employees are far from the only ones doing this.
Statistics show that many corporate employees are sending company data straight to ChatGPT and letting it help with the processing.
Cyberhaven analyzed ChatGPT usage across 1.6 million employees and found:
3.1% of workers have pasted internal company data directly into ChatGPT for analysis.
Cyberhaven, a data-security analytics provider, has built a product for protecting enterprise data that helps companies observe and analyze data flows and understand the causes of data loss in real time.
They found that as ChatGPT's adoption grew, so did the volume of corporate data workers uploaded to it.
On March 14 alone, an average of 5,267 pieces of corporate data were sent to ChatGPT per 100,000 employees.
So how much of that is sensitive data?
The data shows that 11% of the corporate data employees send directly to ChatGPT is sensitive.
In a single week, for example, per 100,000 employees there were 199 uploads of confidential documents, 173 uploads of customer data, and 159 uploads of source code to ChatGPT.
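Detection tools in this space typically work by screening outbound text for sensitive patterns before (or as) it leaves the network. The sketch below shows the general idea with a few regex rules; the patterns and function are illustrative assumptions, not Cyberhaven's actual detection logic.

```python
# Toy pattern-based screen for outbound text, in the spirit of data-loss
# prevention tools. The rules here are illustrative, not a vendor's real ones.
import re

SENSITIVE_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "internal_marker": re.compile(r"\b(confidential|internal only)\b",
                                  re.IGNORECASE),
}

def flag_sensitive(text: str) -> list[str]:
    """Return the names of all rules that match the outgoing text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(text)]

print(flag_sensitive("CONFIDENTIAL: yield data attached, contact kim@example.com"))
# -> ['email', 'internal_marker']
```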
Whether uploaded data actually gets used for training is one question; whether it is stored securely is another.
ChatGPT's recent bug that leaked user information has put many companies on alert about exactly this point.
ChatGPT has had a data leak bug
In fact, to avoid the risk of data breaches, many companies have explicitly banned employees from using ChatGPT.
SoftBank, Hitachi, Fujitsu, and JPMorgan Chase, among others, have issued such notices.
TSMC, another major chipmaker, likewise said a few days ago that employees must not disclose company-specific information when using ChatGPT and should mind their personal privacy.
Italy's data protection authority has gone further, announcing a ban on ChatGPT and restricting OpenAI, the company behind it, from processing Italian users' information.
Part of the reason for this panic traces back to ChatGPT itself.
In late March, ChatGPT was revealed to have a bug that could leak users' conversation data and payment information, and the service was briefly taken offline as a result.
In its response, OpenAI said that for about nine hours the vulnerability may have exposed the payment information of 1.2% of ChatGPT Plus users, including name, email address, payment address, the last four digits of the credit card number, and the card's expiration date.
The bug also allowed users' conversation titles and histories to be seen by others, putting any private information they contained at risk of leaking.
OpenAI CEO Sam Altman responded immediately, saying the bug came from an open-source library the company uses to cache user information on its servers.
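OpenAI's incident report attributed the flaw to the open-source redis-py client: a request cancelled at the wrong moment could leave a shared connection out of sync, so the next request read back someone else's cached response. The toy model below illustrates that class of bug with a shared FIFO pipeline; it is a deliberately simplified sketch, not OpenAI's or redis-py's actual code.

```python
# Toy model of a cross-user cache leak: replies on a shared connection come
# back in FIFO order, so a cancelled request that never reads its reply
# leaves that reply queued for whoever reads next. Illustration only.
import asyncio
from collections import deque

class SharedPipeline:
    """A single connection shared by many requests; replies are FIFO."""
    def __init__(self):
        self._replies = deque()

    async def send(self, payload):
        # The fake "server" echoes the payload back as its reply.
        self._replies.append(f"reply-for:{payload}")

    async def recv(self):
        return self._replies.popleft()

async def request(pipe, user, payload, cancelled=False):
    await pipe.send(payload)
    if cancelled:
        # Cancelled after sending but before reading:
        # the reply stays queued on the shared connection.
        return
    reply = await pipe.recv()
    print(f"{user} asked {payload!r}, got {reply!r}")

async def main():
    pipe = SharedPipeline()
    # User A's request is cancelled mid-flight; its reply is orphaned.
    await request(pipe, "user_a", "a-private-history", cancelled=True)
    # User B's next read pops user A's orphaned reply: a cross-user leak.
    await request(pipe, "user_b", "b-question")

asyncio.run(main())
```

Running this prints user_b receiving the reply meant for user_a, which is the shape of the leak users actually saw: other people's conversation titles showing up in their own sidebar.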
The exact number of affected users is still unclear. OpenAI said it has notified users whose payment information may have been exposed.
But the response has not satisfied everyone. Italy's data protection authority, for example, pointed out that OpenAI failed to inform users about how their information is collected and processed, and lacked a legal basis for collecting and storing personal data.
OpenAI must, within 20 days and through its representative in Europe, notify the authority of the measures taken to meet these requirements, or face a fine of up to 20 million euros or 4% of its annual global turnover.
Now Samsung's reported ChatGPT-related leak has triggered further discussion.
For example, many people still lack a strong sense of privacy protection when using ChatGPT.
And as more and more enterprises adopt ChatGPT, the relevant usage rules need further clarification. Will Microsoft products with ChatGPT built in be banned as well?
What do you think about this?
Reference Links:
[1]https://economist.co.kr/article/view/ecn202303300057
[2]https://n.news.naver.com/article/243/0000042639
[3]https://www.yna.co.kr/view/AKR20230401033500003
[4]https://www.cyberhaven.com/blog/4-2-of-workers-have-pasted-company-data-into-chatgpt/
[5]https://www.cnet.com/tech/services-and-software/chatgpt-bug-exposed-some-subscribers-payment-info/
[6]https://twitter.com/hirokinv/status/1641766639694913537