Saturday, July 6, 2024

PaLM-based Moderation Improves AI and Online Community Trust

Text Moderation

We are delighted to unveil Text Moderation, powered by PaLM 2 and accessible through the Cloud Natural Language API. This feature gives developers the ability to recognize sensitive material in an environment where media is constantly evolving. Text Moderation was developed in conjunction with Jigsaw and Google Research to help businesses screen text for potentially sensitive or harmful material. The following applications illustrate how the Text Moderation service can be used:

  • Brand safety: Protect your brand by avoiding user-generated and publisher content that may not be “brand safe” for your company.
  • User protection: Protect users by detecting material that might be considered objectionable or dangerous.
  • Generative AI risk mitigation: Help prevent generative models from producing inappropriate material in their outputs.

Protect your brand

In today’s hyper-connected world, protecting a company’s good name and reputation for reliability requires a set of practices known as “brand safety.” One of the biggest risks to brand safety is the content that ads appear alongside: if an ad runs on a page whose content conflicts with the values of the sponsoring brand, it reflects poorly on the brand and the organization behind it. It is therefore important for businesses to identify and avoid content that isn’t aligned with their brand guidelines.

Text Moderation gives clients a way to identify content they consider hurtful, objectionable, sensitive in context, or otherwise unsuitable for their brand. Once an organization has identified such content, teams can remove it from advertising campaigns or prevent it from being associated with the brand in the future. This helps ensure that advertising campaigns are effective and that the brand is associated with content that is positive and trustworthy.
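
To make this concrete, here is a minimal sketch of such a filter. It assumes a recent version of the google-cloud-language Python client (which exposes the moderate_text method) and that Application Default Credentials are configured; the 0.7 threshold is an illustrative placeholder, not an official recommendation.

```python
from google.cloud import language_v1

# Illustrative cutoff; a real brand-safety policy would tune this per attribute.
BRAND_SAFETY_THRESHOLD = 0.7

client = language_v1.LanguageServiceClient()

def is_brand_safe(text: str) -> bool:
    """Return True if no safety attribute exceeds the confidence threshold."""
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.moderate_text(document=document)
    return all(
        category.confidence < BRAND_SAFETY_THRESHOLD
        for category in response.moderation_categories
    )

# Keep only pages that pass the check before placing ads against them.
pages = ["A glowing review of a new hiking backpack.", "An inflammatory rant."]
safe_pages = [page for page in pages if is_brand_safe(page)]
```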

Protect users from potentially harmful material

User-generated content poses unique challenges for digital media platforms, game publishers, and online marketplaces, all of which have a financial incentive to address those challenges. They strive to create a space that is safe and welcoming for their users while still allowing open discussion of different points of view.

Text Moderation can help them accomplish this objective by using neural models to identify potentially harmful material, such as harassment or abuse, so that it can be removed. These efforts can reduce harm, improve the user experience, and boost retention.
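
A sketch of that workflow, again assuming the google-cloud-language Python client: comments whose safety attributes cross a hypothetical review threshold are routed to a human-moderation queue rather than published directly.

```python
from google.cloud import language_v1

# Hypothetical value; tune to your tolerance for false positives.
REVIEW_THRESHOLD = 0.6

client = language_v1.LanguageServiceClient()

def triage_comment(comment: str) -> str:
    """Route a user comment to 'publish' or 'review' based on moderation scores."""
    document = language_v1.Document(
        content=comment, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.moderate_text(document=document)
    flagged = [
        category.name
        for category in response.moderation_categories
        if category.confidence >= REVIEW_THRESHOLD
    ]
    return "review" if flagged else "publish"
```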

Reduce the risks posed by generative models

Over the last year, advances in artificial intelligence have made it possible for software to produce increasingly convincing text, images, and video. This has led to new businesses and services that use machine learning, such as text generators, to create content. However, every kind of AI content production carries the risk of generating objectionable material, even if only by accident.

To mitigate this threat, we trained and tested the Text Moderation service on real prompts and replies generated by large-scale generative models. Its adaptability and its coverage of a wide variety of content types make Text Moderation an effective way to safeguard users from potentially harmful output.
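
For example, a service could screen each model reply before returning it to the user. The sketch below assumes the same google-cloud-language client; the refusal message and the 0.8 threshold are placeholders for illustration.

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

def guard_reply(reply: str, threshold: float = 0.8) -> str:
    """Return the model's reply only if no safety attribute scores above threshold."""
    document = language_v1.Document(
        content=reply, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.moderate_text(document=document)
    if any(c.confidence >= threshold for c in response.moderation_categories):
        return "Sorry, I can't share that response."  # placeholder refusal message
    return reply
```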

How to get started with Text Moderation using the Natural Language API

Text Moderation is driven by Google’s most recent PaLM 2 foundation model, which enables it to recognize a broad range of potentially harmful material, such as sexual harassment, bullying, and hate speech. The API can be called from almost any programming language and returns confidence scores for sixteen distinct “safety attributes.” It is simple to use and can be integrated with existing systems.
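
A minimal end-to-end call looks like the following, assuming the google-cloud-language Python client is installed and credentials are configured; it prints the name and confidence of every safety attribute returned for a sample sentence.

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content="I have to read Ulysses by James Joyce.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.moderate_text(document=document)

# Each entry is one safety attribute with a confidence score between 0 and 1.
for category in response.moderation_categories:
    print(f"{category.name}: {category.confidence:.2f}")
```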
