The purpose of the credit module is to develop students' systematic scientific worldview, general cultural outlook, and the competencies needed to identify, pose and solve research problems in computer science, and to evaluate and ensure the quality of the research performed. In particular, students are to master the fundamental principles of the theory of sequential decision-making (the theory of Markov decision processes) and of dynamic programming, and to be able to apply this theoretical knowledge to applied problems, in particular problems of optimal decision-making in industry (technical support of industrial systems, industrial safety inspection systems); robotics (automated forecasting); business (marketing, inventory management); computer science (network troubleshooting, optimizing requests to distributed database servers); state security and military sciences (search for moving targets, target identification, weapon allocation); and health care (medical diagnostics, development of treatment protocols).
Students must master the following competencies.

General competencies:
GC 1 Ability to apply knowledge in practical situations;
GC 3 Ability to think abstractly and to apply methods of analysis and synthesis;
GC 7 Ability to search for, process and analyze information from various sources;
GC 11 Ability to generate new ideas (creativity);
GC 12 Ability to work in a team and to carry out team decisions autonomously.

Professional competencies:
FC 1 Ability to use system analysis as a modern interdisciplinary methodology that is based on applied mathematical methods and modern information technologies and is focused on solving problems of analysis and synthesis of technical, economic, social, environmental and other complex systems;
FC 6 Ability to implement mathematical models of real systems and processes on a computer; to design, apply and maintain software for simulation, decision-making, optimization, information processing and data mining;
FC 7 Ability to use modern information technologies for the computer implementation of mathematical models and for forecasting the behavior of specific systems, namely: the object-oriented approach to designing complex systems of various kinds, applied mathematical packages, and databases and knowledge bases;
FC 10 Ability to design experimental and observational studies and to analyze the data obtained from them.

Upon completion of the course, students should achieve the following program learning outcomes:
PRN 9 Be able to create effective algorithms for computational problems of system analysis and decision support systems;
PRN 12 Apply methods and tools for working with data and knowledge, methods of mathematical, logical-semantic, object and simulation modeling, and technologies of system and statistical analysis;
PRN 14 Understand and apply in practice the methods of statistical modeling and forecasting, and evaluate the initial data;
PRN 17 Preserve and multiply the achievements and values of society based on an understanding of the place of the subject area in the general system of knowledge.
Subject of study. The problems and classes of methods of reinforcement learning, treated as a field of knowledge that covers sequential optimal decision-making under partial observability.

The main tasks of the credit module. According to the requirements of the program of the discipline, postgraduate students who have mastered the credit module must demonstrate the following learning outcomes.

Knowledge: methods and tools of reinforcement learning.

Skills: solving real-world problems with reinforcement learning methods and algorithms. In particular, students must be able to formalize a sequential optimal decision-making problem as a partially observable Markov decision process with possibly unknown transition probabilities and rewards, to apply modern algorithms for the approximate solution of such problems, and to use relevant information technologies and create their own software products for solving real optimal decision-making problems in industry (technical support of industrial systems, industrial safety inspection systems); robotics (automated forecasting); business (marketing, inventory management); computer science (network troubleshooting, optimizing requests to distributed database servers); state security and military sciences (search for moving targets, target identification, weapon allocation); and health care (medical diagnostics, development of treatment protocols).

Experience: creation of a research laboratory for reinforcement learning (a paradigm of organized collaboration based on the experience of leading national laboratories in the United States), in which each team member specializes in a particular task in order to become the best at it, while keeping a holistic view of the entire process.