Is it safe to use generative AI in the classroom?

First: who are we to say?

As artificial intelligence has made its way into the public consciousness, the field’s vast technical challenges have been matched by equally formidable moral quandaries. One of the first facets of life with which AI has begun to integrate is, perhaps surprisingly, a delicate one: education.

How will AI reshape learning? Can it do so without compromising the privacy of students or the integrity of their education? Will AI increase rates of cheating? And, perhaps most crucially, will it support or supplant the invaluable role of teachers?

Merlyn Mind is fortunate to have been able to carefully consider questions like these while we’ve built an AI solution from the ground up. The first enterprise application of our proprietary model was a powerful AI assistant for teachers.

What sets Merlyn apart is that we haven’t attempted to retrospectively reskin a general-purpose AI for classroom use. Instead, over the past five years, our researchers have created a brand-new AI with safety, context, and compliance in mind.

From left, Babak Mostaghimi (Gwinnett County Public Schools), Sharad Sundararajan (Merlyn Mind), Sherry Lachman (OpenAI), and Susan Athey (Stanford Graduate School of Business) in advance of their panel.

The 2023 Stanford Edtech Impact Summit

Last week, leaders in edtech were invited to Stanford for a symposium on the future of education. Merlyn Mind co-founder and Chief Data & Information Officer Sharad Sundararajan was invited to speak about responsible use of AI in edtech.

As the panel veered into topics like AI hallucinations, misinformation, and content appropriateness, the panelists — technologists and educators alike — maintained the position that AI is here to stay. “Pandora’s box has been opened,” an attendee stated at a later seminar. The question of the day became: What can be done now?

The panel’s moderator, Susan Athey of the Stanford Graduate School of Business, emphasized Sharad’s “unique vantage point” in the industry. “I’d love to hear about what you’re doing at Merlyn Mind … building AI models and using LLMs as solutions in education.”

Sharad opened his remarks by firmly addressing the topic of responsible AI use. He emphasized Merlyn’s mission to create domain-specific, hallucination-resistant AI: “First of all, we strongly believe AI should be purpose-built and beneficial. We’ve been very intentional about context and safety.”

He then took a strong stance against the notion that human teachers might be replaced: “At Merlyn Mind, we’re building a co-pilot for educators, with educators, and an AI platform to power it. We work with teachers in 33 states, and we’ve researched their pain points and primary use cases.” Sharad’s remarks consistently echoed the sentiment, “AI should complement, not compete.”

Latha Ramanan, Merlyn’s SVP of Product Growth, in attendance at the conference.

Where do we stand, and where do we go?

Sharad was then asked a hard-hitting question: “What do you think AI models [in education] are currently best at, and what are they not so good at?” The question not only called for a necessary clarification but also brought to light an unavoidable truth: the state of AI is nowhere near perfection.

Sharad listed some strengths of current LLMs: summarization, code generation, knowledge synthesis, and content moderation. He then named several weaknesses: reasoning, particularly in mathematics but also in common sense; sycophancy, the over-compliance of models with user preferences; and a lack of specialized LLMs. Perhaps above all, he cited the “trust barrier,” the idea that AI will have to become highly consistent, which most models are not, before it sees mainstream adoption.

He also distinguished general-purpose AI from purpose-built AI. “We're kind of moving away from larger models, because it's important to get the right model size for the right task, which means we need to understand what these tasks are in education. Looking at [education], you might say you want your content aligned with standards. So, we have a model for that.” On that point, Sharad discussed Toolformers, framing the goal this way: “Given an utterance, we want our product to identify which tool to invoke, which agent to invoke, and which action to take. Given a dialogue, one-on-one or one-to-many, we want our product to ask itself, ‘How do I figure out the next pedagogical, meaningful move, be that an activity, a game, a question, an assessment?’”
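For readers curious what "identify which tool to invoke" could look like in practice, here is a minimal Python sketch. It is purely illustrative and is not Merlyn Mind's implementation: the tool names, keyword lists, and the `route_utterance` function are all hypothetical, and a real system would presumably rely on a trained model rather than keyword matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A hypothetical classroom tool with the action it performs."""
    name: str
    keywords: tuple
    action: str


# Hypothetical tool registry; none of these names come from the panel.
TOOLS = (
    Tool("standards_aligner", ("standard", "aligned", "curriculum"), "align_content"),
    Tool("quiz_builder", ("quiz", "assessment", "question"), "create_assessment"),
    Tool("timer", ("timer", "minutes", "countdown"), "start_timer"),
)

# Used when no tool-specific keyword appears in the utterance.
FALLBACK = Tool("general_assistant", (), "answer")


def route_utterance(utterance: str) -> Tool:
    """Pick the tool whose keywords appear most often in the utterance."""
    text = utterance.lower()
    best, best_score = FALLBACK, 0
    for tool in TOOLS:
        # Count how many of this tool's keywords occur as substrings.
        score = sum(1 for kw in tool.keywords if kw in text)
        if score > best_score:
            best, best_score = tool, score
    return best


if __name__ == "__main__":
    chosen = route_utterance("Set a timer for five minutes")
    print(chosen.name, chosen.action)
```

The sketch keeps each tool small and single-purpose, which loosely mirrors the "right model size for the right task" idea: the router only decides where a request should go, and the chosen tool does the work.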

In his final point on AI risks and degenerative AI, Sharad extended his concerns beyond the technical. He pointed to big tech dominance and the worry that students could become guinea pigs, saying, “Technology has transformative power. But for it to be a force multiplier, we must co-create, collaborate [with the involved communities].”

Merlyn Mind is passionately dedicated to addressing, understanding, and solving the abstract dilemmas posed by the rise of artificial intelligence. As we continue to participate in events like Stanford’s recent conference, we consistently expand and reshape our understanding of our position in society’s future. After the conference, we hosted a private dinner for thought leaders in the space, where we spent hours approaching the issue of responsible AI from all angles.

You can read about that dinner in our next blog post. If you’re interested in building the future with Merlyn Mind, you can check out our open job postings.
