AI programming tools may mean rethinking compsci education

Analysis As the legal and ethical implications of assistive AI models like GitHub’s Copilot continue to be worked out, computer scientists continue to find uses for large language models and are urging educators to adapt.

Brett A. Becker, assistant professor at University College Dublin in Ireland, provided The Register with preprint copies of two research papers exploring the risks and educational opportunities of AI tools for generating programming code.

The papers have been accepted to the 2023 SIGCSE Technical Symposium on Computer Science Education, to be held March 15-18 in Toronto, Canada.

In June, GitHub Copilot, a machine-learning tool that automatically suggests programming code in response to contextual prompts, emerged from a year-long technical preview, even as concerns about how its OpenAI Codex model was trained, and about the implications of AI models for society, coalesced into focused opposition.

Beyond unresolved copyright and software licensing issues, other computer scientists, such as University of Massachusetts Amherst computer science professor Emery Berger, have sounded the alarm about the need to reevaluate computer science pedagogy in light of the expected proliferation and improvement of automated programming assistance tools.

In “Programming Is Hard – Or at Least It Used to Be: Educational Opportunities and Challenges of AI Code Generation” [PDF], Becker and co-authors Paul Denny (University of Auckland, New Zealand), James Finnie-Ansley (University of Auckland), Andrew Luxton-Reilly (University of Auckland), James Prather (Abilene Christian University, USA) and Eddie Antonio Santos (University College Dublin) argue that the education community must address the immediate opportunities and challenges presented by AI-based code generation tools.

They say it’s safe to assume that computer science students are already using these tools to do programming work. Therefore, policies and practices that reflect the new reality must be developed as soon as possible.

“Our view is that these tools have the potential to change the way programming is taught and learned – potentially significantly – in the short term, and that they present multiple opportunities and challenges that warrant immediate discussion as we adapt to using these proliferating tools,” the researchers state in their paper.

These tools are expected to change the way programming is taught and learned – potentially significantly – in the near term.

The paper examines several of the assistive programming tools currently available, including GitHub Copilot, DeepMind AlphaCode, and Amazon CodeWhisperer, as well as lesser-known tools such as Kite, Tabnine, Code4Me, and FauxPilot.

Observing that these tools are moderately competitive with human programmers – AlphaCode, for example, ranked within the top 54% of the more than 5,000 developers participating in Codeforces programming competitions – the boffins argue that AI tools can help students in various ways. These include generating exemplar solutions so students can check their own work, generating variant solutions to broaden students’ understanding of problems, and improving the quality and style of student code.

The authors also see benefits for educators, who could use assistive tools to generate better exercises for students, to generate code explanations, and to provide students with more illustrative examples of programming constructs.

In addition to these potential opportunities, there are challenges educators must confront. Problem-solving, code-emitting tools could make it easier for students to cheat on homework; the private nature of interacting with an AI tool sidesteps some of the risk involved in hiring a third party to do the work.

To that we could add that the quality of the code emitted by AI tools is sometimes poor, which could lead novice programmers to pick up bad habits and write insecure or fragile code.

The researchers observe that the way we approach attribution – central to the definition of plagiarism – may need to be rethought, as assistive tools can provide varying degrees of help, making it difficult to draw the line between legitimate assistance and excessive assistance.

“In other contexts, we use spell-checkers, grammar-checking tools that suggest rephrasing, predictive text and auto-reply email suggestions — all machine-generated,” the paper reminds us. “In a programming context, most development environments support code completion that suggests machine-generated code.

We use spell-checkers, grammar-checking tools that suggest rephrasing…

“Distinguishing between different forms of machine suggestions can be difficult for academics, and it is unclear whether we can reasonably expect introductory programming students who are unfamiliar with tool support to distinguish between different forms of machine-generated code suggestions.”

The authors say this raises a key philosophical question: “How much content can be generated by a machine while still attributing the intellectual property to a human?”

They also point out that AI models fall short of the attribution requirements set out in software licenses, and that they raise unresolved ethical and environmental concerns about the energy used to create them.

The pros and cons of AI tools in education need to be addressed, the researchers conclude, or educators will lose the ability to influence the evolution of this technology.

And they have little doubt that the technology is here to stay. The second paper, “Using Large Language Models to Enhance Programming Error Messages” [PDF], offers an example of the potential value of large language models like OpenAI’s Codex, the basis of Copilot.

Authors Juho Leinonen (Aalto University), Arto Hellas (Aalto University), Sami Sarsa (Aalto University), Brent Reeves (Abilene Christian University), Paul Denny (University of Auckland), James Prather (Abilene Christian University) and Becker fed typically cryptic computer error messages to Codex and found that the AI model can make errors easier to understand by offering plain-English descriptions – a benefit to both teachers and students.

“Large language models can be used to create useful, novice-friendly enhancements to programming error messages that sometimes exceed the original programming error messages in interpretability and actionability,” the boffins say in their paper.

For example, Python may issue the error message: “SyntaxError: unexpected EOF while parsing.” Codex, given the context of the code involved and the error, suggested this description to help the developer: “The error is caused by the block of code expecting another line of code after the colon. To resolve the problem, I would add another line of code after the colon.”
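
For illustration, here is a minimal, hypothetical snippet that reproduces the error. The exact wording varies by interpreter version: Python 3.9 and earlier emit the message above, while Python 3.10 and later report a more specific “expected an indented block” instead.

```python
# Hypothetical repro: a colon-terminated block header with no body,
# so the parser hits end-of-file while still expecting an indented
# block. On Python 3.9 and earlier this raises
# "SyntaxError: unexpected EOF while parsing".
broken_source = "for n in [1, 2, 3]:\n"

try:
    compile(broken_source, "<student_code>", "exec")
except SyntaxError as err:  # IndentationError is a SyntaxError subclass
    print(f"{type(err).__name__}: {err.msg}")
```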

However, the study’s results speak more to the technology’s promise than to its current usefulness. The researchers fed broken Python code and the corresponding error messages into the Codex model to generate explanations of the problems, then evaluated those explanations for: comprehensibility; unnecessary content; whether they included an explanation; whether that explanation was correct; whether they included a fix; whether the fix was correct; and value added over the original error message.
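
As a rough sketch of that setup – assuming the legacy OpenAI completions API, with the model name, prompt wording, and parameters invented here for illustration rather than taken from the paper – the pipeline might look like this:

```python
# Sketch of the paper's general approach (not its exact configuration):
# hand a Codex-style model the broken code plus the interpreter's
# error message, and ask for a plain-English explanation and fix.
import openai  # pre-1.0 SDK with the legacy Completion endpoint

openai.api_key = "sk-..."  # placeholder key

broken_code = "for n in [1, 2, 3]:\n"
error_message = "SyntaxError: unexpected EOF while parsing"

prompt = (
    "Code:\n"
    f"{broken_code}\n"
    f"Error message: {error_message}\n"
    "Plain English explanation of the error and a suggested fix:\n"
)

response = openai.Completion.create(
    model="code-davinci-002",  # assumed Codex model name
    prompt=prompt,
    max_tokens=150,
    temperature=0.0,  # favor consistent output for later evaluation
)

print(response.choices[0].text.strip())
```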

The results varied considerably from category to category. Most outputs were comprehensible and included an explanation, but the model explained some errors correctly far more often than others. The error “cannot assign to function call,” for example, was explained correctly 83% of the time, while “unexpected EOF while parsing” was explained correctly only 11% of the time. And across all error messages, the proposed fixes were correct only 33% of the time.

“Overall, the reviewers felt that the content created by Codex, i.e. the error message explanation and the proposed fix, was an improvement over the original error message in a little more than half of the cases (54%),” the paper states.

The researchers conclude that while the error message explanations and suggested fixes generated by large language models are not yet ready for production use – and could still mislead students – AI models could, with further work, become adept at explaining code errors.

Expect this work to keep the tech industry, academia, government, and other interested parties busy for years to come. ®

Sam D. Gomez