Personal Information

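Lab Research Projects
Research on the Autonomy Boundary Between Humans and Machines in Human-Machine Systems and Its Switching Strategies
Research Topic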
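This work targets new human-machine hybrid intelligence systems and considers how to balance the fusion of machine intelligence and human intelligence at the decision-making level. Building on an investigation of autonomy boundaries, it aims to optimize the design of existing human-machine “intervention” control systems (Human-On-The-Loop) and human-machine “shared” control systems (Human-In-The-Loop).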
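Academic Output
Authored or co-authored 0 patents, 4 papers accepted or published, and 0 papers submitted and awaiting acceptance. This section is generated automatically from the bib file maintained personally by Prof. Yun-Bo Zhao and includes only papers and patents that he co-authored (jointly supervised or proxy-supervised students may have papers or patents that do not carry his name, and these are not shown here). Some papers or patents may be missing because of delayed updates; if anything is missing, please contact him promptly so that it can be added.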
Journal Articles
-
Arbitration Optimization under Shared Control for Human-Machine Sequential Decision-Making (面向人机序贯决策实现共享控制下的仲裁优化)
Qianqian Zhang,
Yun-Bo Zhao,
Wenjun Lv,
and Mou Chen
Scientia Sinica Informationis
2023
[doi]
[pdf]
-
Traded Control of Human–Machine Systems for Sequential Decision-Making Based on Reinforcement Learning
Qianqian Zhang,
Yu Kang,
Yun-Bo Zhao,
Pengfei Li,
and Shiyi You
IEEE Trans. Artif. Intell.
2022
[doi]
[pdf]
Conference Articles
-
Adaptive Arbitration for Minimal Intervention Shared Control via Deep Reinforcement Learning
Shiyi You,
Yu Kang,
Yun-Bo Zhao,
and Qianqian Zhang
In 2021 China Autom. Congr. CAC
2021
[Abs]
[doi]
[pdf]
In shared control, humans and intelligent robots jointly complete real-time control tasks, combining their complementary capabilities to reach performance that neither side could achieve on its own; this approach has attracted growing attention in recent years. Arbitration, an indispensable part of shared control, determines how control authority is allocated between the human and the robot, and defining that policy has always been one of the fundamental problems. In this paper, we propose an adaptive arbitration method for shared control systems, based on deep reinforcement learning, which minimizes the deviation from the human inputs while ensuring system performance. We provide humans with the maximum assistance at the minimal intervention, in order to balance the human's need for control authority against the need for performance. We apply our method to real-time control tasks, and the results show that it achieves a high task success rate and shorter task completion time with less human workload, while maintaining higher human satisfaction.
-
Autonomous Boundary of Human-Machine Collaboration System Based on Reinforcement Learning
Qianqian Zhang,
Yun-Bo Zhao,
and Yu Kang
In 2020 Aust. N. Z. Control Conf. ANZCC
2020
[Abs]
[doi]
[pdf]
This paper provides a human-machine collaborative control framework, comprising artificial intelligence decision systems, human-level control, arbiter judgment, and learning of the autonomous boundary, so that human suggestions are incorporated into the training of decisions and help agents learn control decision tasks quickly. Based on the model-free deep reinforcement learning algorithm HITL-AC, the human feedback (reward or punishment) is connected with the reward of the agent, so that the agent continuously tries to find a better boundary during the system's operation, avoiding the defects of a pre-fixed boundary. This formulation improves the data efficiency of reinforcement learning and guides the agent to seek human intervention when it is in an uncertain environmental state during the test phase. The fourth section of the paper gives a training demonstration on the bipedal walker. The experimental results show that human intervention can accelerate the agent's reinforcement learning during the training phase, and that the agent is guided to seek human help in dangerous states during the test phase. This is beneficial for solving real-world problems and further supports the feasibility and effectiveness of the proposed framework and method.
Degree Thesis
Theses
-
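Research on Hybrid Intelligence Methods for Human-Machine Sequential Decision-Making (面向人机序贯决策的混合智能方法研究)
Qianqian Zhang
University of Science and Technology of China, Hefei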
2021
[Abs]
[pdf]
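As artificial intelligence technology advances, machine intelligence keeps improving and is being applied across more and more industries. In this process, machines inevitably encounter situations in which their autonomy is insufficient, either because the problem should be solved by humans or because humans must take part in the decision. Studying decision problems in which human intelligence and machine intelligence act together is therefore especially important and meaningful. More specifically, sequential decision problems are dynamic decision problems that are temporal and multi-stage, and their development is closely tied to engineering applications, production, and daily life in the current AI era. Humans play a role in sequential decision problems in two ways. First, the human may be part of the problem model itself, so that the problem cannot be separated from the human, as in minimally invasive surgery. Second, information about the human may not appear in the problem model, but the human's unique cognitive abilities allow the human to take part in the solution method and improve it, as when a human guides a robotic search-and-rescue system. We refer to both scenarios collectively as “human-machine sequential decision problems.”

In human-machine sequential decision problems, the essential differences between human and machine intelligence, and the large gap between their mathematical representations, mean that when humans and machines jointly solve a problem, poor coordination inevitably leads to low decision quality or even decision errors. Directly applying the control algorithms of traditional human-machine systems cannot handle these issues effectively; it causes machine agents to fail, wastes human effort, and can even degrade or crash the decision system. Effective human-machine hybrid intelligence algorithms are therefore urgently needed. Taking human-machine sequential decision problems as its subject, this thesis studies three questions in human-machine hybrid intelligent control: how to divide decision authority, when to trigger switching in intervention control, and how far to blend human and machine decision actions in shared control. The aim is to propose effective human-machine hybrid intelligence algorithms that improve the solution of human-machine sequential decision problems. The main contributions are as follows:

1. A human-machine hybrid intelligent control framework based on reinforcement learning is proposed. The machine agent's decision and the human's decision are arbitrated using credibility and safety as evaluation criteria, to determine the better action to execute. Both a model-based reinforcement learning subsystem and a model-free reinforcement learning subsystem are considered, which makes the framework adaptable to a wider range of sequential decision applications.
2. For the intervention control problem in human-machine sequential decision-making, the concepts of autonomy and autonomy boundary are proposed. Solving for the autonomy boundary is formalized as a conventional optimization problem tied to the task objective. This is used to optimize the control scheme and algorithm of intervention control, improving decision performance both when the human intervenes in the machine and when the machine intervenes in the human.
3. For the shared control problem in human-machine sequential decision-making, a blending-parameter optimization scheme based on the autonomy boundary is proposed. Adaptively adjusting the blending parameter directly affects how the final action to execute is generated. The degree to which human and machine actions are fused is taken into account, so that the optimal solution can appear in the extended space jointly spanned by the human's action space and the machine's action space, which leaves more room to improve decision quality.
4. The autonomy boundary estimated in intervention control and shared control is a single value, which may be inaccurate. To address this, an uncertainty estimation method based on Bayesian neural networks is proposed. It obtains the probability distribution of the autonomy boundary and uses it when generating decision actions. Designing the human-machine hybrid intelligence algorithms around the uncertainty of the autonomy boundary both gives more options for optimizing decision actions and better matches people's fuzzy thinking about decision boundaries.

In summary, this thesis systematically studies the problems that hybrid intelligence algorithms face in human-machine sequential decision-making, proposes innovative solutions to them, and advances both the solving of human-machine sequential decision problems and hybrid intelligence algorithms.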
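The shared-control idea in the thesis abstract, a blending parameter that sets how strongly the human and machine actions are mixed, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the convex-combination form, the names `blend_actions` and `blend_with_uncertainty`, and the use of a plain list of sampled blending values as a stand-in for the output of a Bayesian neural network.

```python
from statistics import mean, pstdev


def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, float(x)))


def blend_actions(a_human, a_machine, alpha):
    """Blend two action vectors (hypothetical form, not the thesis's).

    alpha = 1 returns the human action, alpha = 0 the machine action,
    and values in between return a convex combination of the two.
    """
    if len(a_human) != len(a_machine):
        raise ValueError("action vectors must have the same length")
    alpha = clamp01(alpha)
    return [alpha * h + (1.0 - alpha) * m for h, m in zip(a_human, a_machine)]


def blend_with_uncertainty(a_human, a_machine, alpha_samples):
    """Blend using the mean of sampled blending values.

    alpha_samples stands in for draws from a distribution over the
    blending parameter. The spread of the samples is returned as well,
    so a caller could, for example, ask for human input when it is large.
    """
    alphas = [clamp01(a) for a in alpha_samples]
    action = blend_actions(a_human, a_machine, mean(alphas))
    return action, pstdev(alphas)


if __name__ == "__main__":
    human = [1.0, 0.0]
    machine = [0.0, 1.0]
    print(blend_actions(human, machine, 0.25))
    print(blend_with_uncertainty(human, machine, [0.2, 0.4]))
```

Because this blend is linear in the blending parameter, averaging the sampled values before blending gives the same action as averaging the blended actions. A nonlinear fusion rule would not have this property and would need the samples to be blended one by one.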
Post-Graduation Position
School of Artificial Intelligence, Anhui University, Lecturer