Machines have taken over marking in numerous large-scale exams, and researchers are exploring automatic grading of open-ended answers such as essays. Reported by Xu Diwei, with intern Li Yan.
During this year's high school entrance exam, Xiangyang City implemented an intelligent online marking system (video: Leiyang Radio and Television Station, 02:51).
For every major exam, marking is a critical task that demands both time and effort. As artificial intelligence continues to evolve, robotic marking technology has grown increasingly sophisticated in recent years.
Recently, iFlytek told Xinhua News that, organized by the Ministry of Education's examination center, its intelligent marking technology has been applied to large-scale exams in many provinces, including college entrance exams, adult college entrance exams, and academic proficiency tests, and has passed multiple large-scale pilot verifications.
In Hubei Province's 2017 high school entrance exam, Xiangyang City was the first to adopt the intelligent marking system. Liu Chaozhi, director of the Municipal Education Examination Institute, told reporters: "Compared with manual marking, intelligent marking is faster and better handles issues such as identical (copied) and blank papers."
Verified in multiple large-scale pilots
In March 2016, the Ministry of Education's Examination Center and iFlytek established a joint laboratory to research artificial intelligence technologies for examinations, focusing on intelligent marking, question setting, and assessment evaluation.
iFlytek recently told Xinhua News that, organized by the examination center, its all-subject intelligent marking technology has undergone multi-scale pilot verification in large-scale exams across the country, including CET-4 and CET-6, college entrance exams, adult college entrance exams, and provincial academic proficiency tests.
The verification results show that the computer's scores closely match those of on-site human markers and fully meet the needs of large-scale exams.
In the past, analyzing hundreds of thousands to millions of exam papers required vast human resources, making it impractical. However, with precise image recognition and massive text retrieval technologies, it's now possible to swiftly examine all exam papers and identify similar texts, as well as quickly extract and flag potentially problematic questions.
According to the Xiangyang Evening News, unlike in previous years, the 2017 high school entrance exam in Xiangyang, Hubei Province introduced an intelligent evaluation system for the first time. A technician at the marking site said that the system can analyze workloads, tally the total marked by each marker, and monitor the quality of each teacher's marking.
Liu Chaozhi, director of the Xiangyang Education Examination Institute, said that with the system's data, the score distribution of each question, the city's average score, which areas of knowledge students have mastered, and where teaching falls short can all be analyzed, giving educators a diagnostic report that makes teaching and learning more targeted. "Compared with manual marking, intelligent marking not only speeds up the process but also addresses shortcomings in handling identical and blank papers."
Gong Xun, an admissions staff member at the Xiangyang Education Examination Institute, noted that the intelligent marking system covers most question types. With it, massive data can be searched to accurately determine whether an answer has been copied.
On July 19, Liu Chaozhi informed Xinhua News that more information would be disclosed in due time.
iFlytek told Xinhua News that intelligent marking uses text recognition based on deep neural network learning, which can now recognize handwritten Chinese and English characters. Applied in formal exams, the technology can assist manual marking, reduce manpower, minimize the influence of fatigue and emotion on human markers, and improve the efficiency, accuracy, and fairness of scoring, a significant shift for the whole industry.
In addition, the massive, precise analytical data generated once all candidates' papers are digitized provides valuable material for future research on teaching and learning, opening the possibility of integrating intelligent scoring and marking more closely with real classroom practice.
"The fundamental goal is to use artificial intelligence to push exam marking into the 3.0 era," Wu Xiaoru, president of iFlytek, told Xinhua News in June. "The 1.0 era was pen-and-paper marking. In the 2.0 era, marking moved online and machines automatically scored some objective questions. In the AI era, even subjective questions can be marked automatically."
Machine marking of subjective questions is no longer a dream
General exams typically consist of two parts: objective questions and subjective questions. With the use of answer sheets and scanners, objective questions can all be reviewed by machines. Not only does this increase the speed of grading, but it also improves accuracy.
Since the 1960s, many researchers abroad have worked on automated scoring of subjective questions, producing various systems such as E-rater, used in the U.S. GMAT and TOEFL exams. However, most of these systems target second-language compositions, i.e., essays written in a non-native language. Marking native-language essays demands higher-level evaluation, such as literary quality, text cohesion, and the essay's overall conception.
In November 2015, iFlytek's intelligent marking technology was successfully piloted in Anqing and Hefei. Analysis of the human-machine scoring results showed that the computer reached or surpassed human markers in score agreement rate, average score difference, correlation, and the proportion of scores closer to the arbitration score. Machine marking of subjective questions, in other words, is no longer far-fetched.
So on what principle does a machine mark subjective questions that have no single objective criterion? Wu Xiaoru explained that the essential difference between machine and manual marking lies in the working mechanism: machines decide through statistics, inference, and judgment, which differs from human thought. During marking, the machine learns intelligently: after a panel of experts marks roughly 500 to 1,000 papers, the machine learns their marking patterns and forms a model that effectively covers the remaining papers, which it then marks automatically.
As for metrics, a panel of highly qualified experts is first selected, and their average score on a set of papers is taken as the reference standard. The machine's scores, and those of other markers, are then compared against this expert average; if the machine's scores are closer to and more correlated with the expert average, its marking is considered effective.
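The comparison described above can be sketched in a few lines of Python. The scores, tolerance, and panel data here are invented for illustration; the article does not specify the exact metrics' formulas.

```python
# Sketch of the evaluation described above: the expert panel's average score
# per paper is the reference, and a marker (human or machine) is judged by
# how closely its scores track it. All numbers below are illustrative.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

def compare_to_experts(scores, expert_avg, tolerance=2):
    """Return (agreement rate, mean absolute difference, correlation)."""
    diffs = [abs(s - e) for s, e in zip(scores, expert_avg)]
    agreement = sum(d <= tolerance for d in diffs) / len(diffs)
    return agreement, mean(diffs), pearson(scores, expert_avg)

expert_avg = [42, 38, 45, 30, 36]   # expert panel's average per essay
machine    = [41, 39, 44, 31, 35]   # hypothetical machine scores
rate, mad, r = compare_to_experts(machine, expert_avg)
```

A high agreement rate, small mean difference, and high correlation together would indicate, in the article's terms, that the machine's marking is "promising."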
"Only a simple or rigid scoring pattern is easy to game; across the many applications so far, no one has found a good way to fool the machine," Wu Xiaoru said. "It's like AlphaGo: you can't beat it just by finding some fixed, standard routine."
Moreover, Wu Xiaoru said, the machine flags unique and creative papers for manual review. Papers whose innovative ideas are dragged down by low-level errors likewise go to on-site markers and experts for judgment.
Wu Xiaoru added that machine marking of subjective questions has been verified over a long period. "Many education experts, frontline teachers, and principals initially disagreed with machine grading, but after comparing results on site, they eventually acknowledged that the machine performs better than manual marking."
Exploring automatic essay scoring
In recent years, the most prominent achievement in machine scoring of subjective questions has been the Chinese essay scoring technology developed by the HIT-iFlytek Joint Laboratory.
Scoring an essay requires a deep understanding of the text. Which dimensions should the machine evaluate, and how can they be quantified?
According to the researchers, just as teachers nationwide grade Chinese and college-entrance-exam essays against a uniform, rigorous standard, a machine marking essays must first learn that standard and then apply it.
That is, teachers first establish a common scheme for assessing essay quality, covering aspects such as handwriting neatness, vocabulary richness, sentence fluency, literary quality, discourse structure, and conception. The machine then uses algorithms to learn the scoring criteria from a small number of human-scored samples. For instance, in an exam with 2,000 papers, the machine learns the teachers' marking from the first papers onward; after about 200, it can take over and score the remaining papers automatically.
In the essay scoring system, vocabulary richness and conception are content-related features, while fluency, coherence, syntactic correctness, and discourse structure are expression-related features. In addition, the technology uses artificial neural networks to model the essay's semantics deeply, grasping its conception at the macro level.
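The learning step described above, fitting a scoring model to a small set of teacher-scored samples, can be sketched as a toy linear model. The feature names, training data, and model choice here are all illustrative; the real system uses many more features and neural networks.

```python
# Minimal sketch: each essay is reduced to numeric features (here just two
# made-up ones, vocabulary richness and coherence, both in [0, 1]), and a
# linear model is fit to teacher-assigned scores by gradient descent.
def fit_linear(features, scores, lr=0.1, epochs=5000):
    """Fit score ~ w.x + b by gradient descent on mean squared error."""
    n_feat, n = len(features[0]), len(features)
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, y in zip(features, scores):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for i in range(n_feat):
                grad_w[i] += 2 * err * x[i] / n
            grad_b += 2 * err / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Teacher-scored training essays: (vocab richness, coherence) -> score
train_x = [(0.9, 0.8), (0.4, 0.5), (0.7, 0.9), (0.2, 0.3)]
train_y = [45, 30, 42, 22]
w, b = fit_linear(train_x, train_y)
predict = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
```

Once fit, `predict` stands in for the "model" the article describes: it scores the remaining papers automatically, while unusual papers would still be routed to human markers.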
Each of these criteria requires sophisticated technology to support it. For example, handwriting recognition is needed to judge neatness: handwritten characters in an image are automatically converted to text, and the recognition confidence indicates how neat the writing is. To judge whether an essay is off-topic, keywords are first extracted from the prompt and expanded around its themes, keywords are extracted from the essay, and the similarity between the two keyword sets is computed. A topic model can also be trained on the exam's large-scale data to obtain a global topic distribution, which is then compared with the topic distribution of the essay being evaluated.
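The off-topic check described above can be sketched as a keyword-overlap measure. The stopword list, hand-made expansion set, and example texts are all invented for illustration; real systems would use learned topic expansion and richer similarity measures.

```python
# Toy sketch of the off-topic check: extract keywords from the prompt
# (with a small hand-made expansion list standing in for real topic
# expansion), extract keywords from the essay, and measure their overlap.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "my", "to", "i"}

def keywords(text):
    """Lowercased content words of a text, minus stopwords."""
    return {w.strip(".,").lower() for w in text.split()} - STOPWORDS

def topic_similarity(prompt, expansion, essay):
    """Jaccard overlap between expanded prompt keywords and essay keywords."""
    topic = keywords(prompt) | set(expansion)
    body = keywords(essay)
    return len(topic & body) / len(topic | body)

prompt = "My favorite season"
expansion = {"spring", "summer", "autumn", "winter", "weather"}
on_topic = "I love autumn because the weather is cool and the season changes."
off_topic = "Yesterday we played football and won the match."
sim_on = topic_similarity(prompt, expansion, on_topic)
sim_off = topic_similarity(prompt, expansion, off_topic)
```

An essay whose similarity falls below some threshold would be flagged as potentially off-topic, here the football essay scores zero overlap with the season prompt.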
iFlytek, which participated in the national "863 Program" (National High-Tech Research and Development Program), said that as artificial intelligence technology develops, even subjective questions in politics, history, and geography will eventually be checked automatically by machines.
After automatic machine reading becomes a reality, teachers will have more time and energy to invest in research on innovative teaching methods, providing students with higher quality and more comprehensive education.