Resources

Microsoft Needs So Much Power to Train AI That It’s Considering Small Nuclear Reactors

Training large language models is an incredibly power-intensive process with an immense carbon footprint. Keeping data centers running requires a ludicrous amount of electricity that can generate substantial greenhouse gas emissions — depending, of course, on the energy’s source. Now, The Verge reports, Microsoft is betting so big on AI that it’s pushing forward with a plan to power its data centers using nuclear reactors. Yes, you read that right; a recent job listing suggests the company is planning to grow its energy infrastructure with small modular reactors (SMRs)…

But before Microsoft can start relying on nuclear power to train its AIs, it’ll have plenty of other hurdles to overcome. For one, it’ll have to source a working SMR design. Then, it’ll have to figure out how to get its hands on the high-assay low-enriched uranium (HALEU) fuel that these small reactors typically require, as The Verge points out. Finally, it’ll need to figure out how to store all of that nuclear waste long term…

Beyond nuclear fission, Microsoft is also investing in nuclear fusion, a far more ambitious endeavor given the many decades of research that have yet to yield a practical power system. Nevertheless, the company signed a power purchase agreement earlier this year with Helion, a fusion startup backed by OpenAI CEO Sam Altman, in the hope of buying electricity from it as soon as 2028.

Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything — and do so in a way that attempts to preserve the speaker’s emotional tone. Its creators speculate that VALL-E could be used for high-quality text-to-speech applications; for speech editing, where a recording of a person could be altered via its text transcript (making them say something they originally didn’t); and for audio content creation when combined with other generative AI models like GPT-3.

Microsoft calls VALL-E a “neural codec language model,” and it builds on a technology called EnCodec, which Meta announced in October 2022. Unlike typical text-to-speech methods that synthesize speech by manipulating waveforms, VALL-E generates discrete audio codec codes from text and acoustic prompts. It essentially analyzes how a person sounds, breaks that information into discrete components (called “tokens”) using EnCodec, and uses its training data to match what it “knows” about how that voice would sound speaking phrases beyond the three-second sample. Or, as Microsoft puts it in the VALL-E paper (PDF): “To synthesize personalized speech (e.g., zero-shot TTS), VALL-E generates the corresponding acoustic tokens conditioned on the acoustic tokens of the 3-second enrolled recording and the phoneme prompt, which constrain the speaker and content information respectively. Finally, the generated acoustic tokens are used to synthesize the final waveform with the corresponding neural codec decoder.”
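The tokenize-then-generate pipeline Microsoft describes can be sketched in miniature. The toy Python below is purely illustrative and is not Microsoft’s code: `phonemize`, `encodec_encode`, and `generate_acoustic_tokens` are stand-in stubs invented here to show only the data flow — the phoneme prompt constrains the *content*, the enrolled recording’s acoustic tokens constrain the *voice* — with no real neural network behind them.

```python
# Toy sketch of a VALL-E-style conditioning flow. All functions are stubs.
import random

def phonemize(text: str) -> list[str]:
    """Stub grapheme-to-phoneme step; real systems use a G2P model."""
    return list(text.lower().replace(" ", ""))

def encodec_encode(audio_sample: bytes, codebook_size: int = 1024) -> list[int]:
    """Stub for EnCodec: maps audio to discrete codec tokens deterministically."""
    return [b % codebook_size for b in audio_sample]

def generate_acoustic_tokens(phonemes: list[str], prompt_tokens: list[int],
                             n_steps: int = 16, seed: int = 0) -> list[int]:
    """Emit new acoustic tokens conditioned on both the phoneme prompt
    (content) and the enrolled speaker's tokens (voice). A real model
    predicts a distribution per step; here we just mix the two conditions."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_steps):
        out.append((rng.choice(prompt_tokens) + ord(rng.choice(phonemes))) % 1024)
    return out

enrolled = encodec_encode(b"three-second speaker sample")   # "Speaker Prompt"
tokens = generate_acoustic_tokens(phonemize("hello there"), enrolled, seed=42)
# A neural codec decoder would turn `tokens` back into a waveform; changing
# `seed` varies the output, mirroring the paper's "Synthesis of Diversity" note.
```

In the real system each of these stubs is a trained neural component; the sketch only mirrors the paper’s description that generation is conditioned jointly on acoustic tokens and a phoneme prompt.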

[…] While generating those results, the researchers fed only the three-second “Speaker Prompt” sample and a text string (what they wanted the voice to say) into VALL-E. Comparing the “Ground Truth” sample to the “VALL-E” sample, in some cases the two are very close. Some VALL-E results sound computer-generated, but others could plausibly be mistaken for human speech, which is the goal of the model. In addition to preserving a speaker’s vocal timbre and emotional tone, VALL-E can also imitate the “acoustic environment” of the sample audio. For example, if the sample came from a telephone call, the synthesized output will simulate the acoustic and frequency properties of a telephone call (that’s a fancy way of saying it will sound like a telephone call, too). And Microsoft’s samples (in the “Synthesis of Diversity” section) demonstrate that VALL-E can vary voice tone by changing the random seed used in the generation process.

Microsoft has not provided VALL-E code for others to experiment with, likely to avoid fueling misinformation and deception.

Microsoft Also Patented Tech to Score Meetings Using Filmed Body Language, Facial Expressions

Newly surfaced Microsoft patent filings describe a system for deriving and predicting “overall quality scores” for meetings using data such as body language, facial expressions, room temperature, time of day, and number of people in the meeting. The system uses cameras, sensors, and software tools to determine, for example, “how much a participant contributes to a meeting vs performing other tasks (e.g., texting, checking email, browsing the Internet).”

The “meeting insight computing system” would then predict the likelihood that a group will hold a high-quality meeting. It would flag potential challenges when an organizer is setting the meeting up, and recommend alternative venues, times, or people to include in the meeting, for example… A patent application made public Nov. 12 notes, “many organizations are plagued by overly long, poorly attended, and recurring meetings that could be modified and/or avoided if more information regarding meeting quality was available.” The approach would apply to in-person and virtual meetings, and hybrids of the two…

The filings do not detail any potential privacy safeguards. A Microsoft spokesperson declined to comment on the patent filings in response to GeekWire’s inquiry. To be sure, patents are not products, and there’s no sign yet that Microsoft plans to roll out this hypothetical system. Microsoft has established an internal artificial intelligence ethics office and a companywide committee to ensure that its AI products live by its principles of responsible AI, including transparency and privacy. However, the filings are a window into the ideas floating around inside Microsoft, and they’re consistent with the direction the company is already heading.

Skype Audio Graded by Workers in China With ‘No Security Measures’

A Microsoft program to transcribe and vet audio from Skype and Cortana, its voice assistant, ran for years with “no security measures,” according to a former contractor who says he reviewed thousands of potentially sensitive recordings on his personal laptop from his home in Beijing over the two years he worked for the company.

The recordings, which included both deliberate and accidental activations of the voice assistant as well as some Skype phone calls, were accessed by Microsoft workers simply through a web app running in Google’s Chrome browser, on their personal laptops, over the Chinese internet, according to the contractor. Workers had no cybersecurity help to protect the data from criminal or state interference, and were even instructed to do the work using new Microsoft accounts that all shared the same password, for ease of management, the former contractor said. Employee vetting was practically nonexistent, he added.

“There were no security measures, I don’t even remember them doing proper KYC [know your customer] on me. I think they just took my Chinese bank account details,” he told the Guardian. While the grader began by working in an office, he said the contractor that employed him “after a while allowed me to do it from home in Beijing. I judged British English (because I’m British), so I listened to people who had their Microsoft device set to British English, and I had access to all of this from my home laptop with a simple username and password login.” Both username and password were emailed to new contractors in plaintext, he said, with the former following a simple schema and the latter being the same for every employee who joined in any given year.

Microsoft Turned Down Facial-Recognition Sales over “Human Rights Concerns”

Microsoft recently rejected a California law enforcement agency’s request to install facial recognition technology in officers’ cars and body cameras due to human rights concerns, company President Brad Smith said on Tuesday. Microsoft concluded the deployment would lead to innocent women and minorities being disproportionately held for questioning, because the artificial intelligence had been trained mostly on pictures of white men. Multiple research projects have found that AI misidentifies women and minorities more often.

Smith explained the decisions as part of a commitment to human rights that he said was increasingly critical as rapid technological advances empower governments to conduct blanket surveillance, deploy autonomous weapons and take other steps that might prove impossible to reverse. Smith also said at a Stanford University conference that Microsoft had declined a deal to install facial recognition on cameras blanketing the capital city of an unnamed country that the nonprofit Freedom House had deemed not free. Smith said it would have suppressed freedom of assembly there.

On the other hand, Microsoft did agree to provide the technology to an American prison, after the company concluded that the environment would be limited and that it would improve safety inside the unnamed institution.

Dutch Government Report Says Microsoft Office Telemetry Collection Breaks EU GDPR Laws

Microsoft broke Euro privacy rules by carrying out the “large scale and covert” gathering of private data through its Office apps, according to a report commissioned by the Dutch government.

The report found that Microsoft was collecting telemetry and other content from its Office applications, including email subject lines and sentences where the translation or spellchecker functions were used, and covertly storing the data on systems in the United States.

Those actions break Europe’s new GDPR privacy safeguards, it is claimed, and may put Microsoft on the hook for potentially tens of millions of dollars in fines. The Dutch authorities are working with the corporation to fix the situation, and are using the threat of a fine as a stick to make it happen.

The investigation was prompted by the fact that Microsoft doesn’t publicly reveal what information it gathers on users, and provides no option to turn off the diagnostic and telemetry data its Office software sends back to the company to monitor how well the software is functioning and to identify any issues.

India’s Biometric Database Is Creating A Perfect Surveillance State — And U.S. Tech Companies Are On Board

Big U.S. technology companies are involved in the construction of one of the most intrusive citizen surveillance programs in history. For the past nine years, India has been building the world’s biggest biometric database by collecting the fingerprints, iris scans and photos of nearly 1.3 billion people. For U.S. tech companies like Microsoft, Amazon and Facebook, the project, called Aadhaar (which means “proof” or “basis” in Hindi), could be a gold mine. The CEO of Microsoft has repeatedly praised the project, and local media have carried frequent reports on consultations between the Indian government and senior executives from companies like Apple and Google (in addition to South Korea-based Samsung) on how to make tech products Aadhaar-enabled. But when reporters from HuffPost and HuffPost India asked these companies in recent weeks to confirm they were integrating Aadhaar into their products, only one company — Google — gave a definitive response.

That’s because Aadhaar has become deeply controversial, and the subject of a major Supreme Court of India case that will decide the future of the program as early as this month. Launched nine years ago as a simple and revolutionary way to streamline access to welfare programs for India’s poor, the database has become Indians’ gateway to nearly any type of service — from food stamps to a passport or a cell phone connection. Practical errors in the system have caused millions of poor Indians to lose out on aid. And the exponential growth of the project has sparked concerns among security researchers and academics that Aadhaar is the first step toward a surveillance society to rival China’s.

Facebook, Google, and Microsoft Use Design to Trick You Into Handing Over Your Data, New Report Warns

A study from the Norwegian Consumer Council dug into the underhanded tactics used by Microsoft, Facebook, and Google to collect user data. “The findings include privacy intrusive default settings, misleading wording, giving users an illusion of control, hiding away privacy-friendly choices, take-it-or-leave-it choices, and choice architectures where choosing the privacy friendly option requires more effort for the users,” states the report, which includes images and examples of confusing design choices and strangely worded statements involving the collection and use of personal data.

Google makes opting out of personalized ads more of a chore than it needs to be and uses multiple pages of text, unclear design language, and, as described by the report, “hidden defaults” to push users toward the company’s desired action. “If the user tried to turn the setting off, a popup window appeared explaining what happens if Ads Personalization is turned off, and asked users to reaffirm their choice,” the report explained. “There was no explanation about the possible benefits of turning off Ads Personalization, or negative sides of leaving it turned on.” Those who wish to completely avoid personalized ads must traverse multiple menus, making that “I agree” option seem like the lesser of two evils.

In Windows 10, if a user wants to opt out of “tailored experiences with diagnostic data,” they have to click a dimmed lightbulb, while the symbol for opting in is a brightly shining bulb, says the report.

Another example has to do with Facebook. The social media site makes the “Agree and continue” option much more appealing and less intimidating than the grey “Manage Data Settings” option. The report says the company-suggested option is the easiest to use. “This ‘easy road’ consisted of four clicks to get through the process, which entailed accepting personalized ads from third parties and the use of face recognition. In contrast, users who wanted to limit data collection and use had to go through 13 clicks.”
