[Discussion]
University of Sheffield, UK (QS Top 100) is recruiting a Computer Science PhD student; applications close at the end of October!
We are recruiting a PhD student to develop new algorithms for reinforcement learning from human feedback (RLHF), to effectively solve complex reinforcement learning tasks without a predefined reward function. The primary goal of this project will be the development of a novel RLHF framework that can learn more complex behaviours while requiring significantly less interactive human feedback than current RLHF methods.

The direction of this project is highly flexible, and the student will have the opportunity to explore related directions that match their research interests. We intend for this project to explore applications of the new RLHF framework, such as fine-tuning and aligning large language models (LLMs), and the use of human feedback in robotics. The project may also explore the use of LLMs as part of the RLHF framework itself, to generate and/or interpret natural language feedback. The specific applications and research directions will depend on the student's own interests.

The preferred starting date for this position would be in February 2026, but this is very flexible.

Supervisors: Dr. Bei Peng, Dr. Robert Loftin

Application deadline: October 31, 2025

Requirements:
1. A Bachelor's or Master's degree in Computer Science, Mathematics, or related field.
2. Solid programming skills and mathematical background in machine learning/reinforcement learning.
3. Proficiency in programming languages such as Python and familiarity with common deep learning and machine learning frameworks.
4. Good English communication skills, with an IELTS score of 6.5 or above (with no less than 6.0 in each component).

Scholarship information: For UK home students, this is a fully funded 3.5-year PhD studentship. For international students, you will need to pay the difference between the UK and overseas tuition fees by securing additional funding or self-funding (i.e., the PhD studentship will cover tuition fees but not living expenses).

More information and instructions for how to apply can be found here: https://www.findaphd.com/phds/project/improving-deep-reinforcement-learning-through-interactive-human-feedback/?p186459
(When applying, make sure you name Dr. Bei Peng and Dr. Robert Loftin as your proposed supervisors.)

If you have any questions regarding the position, feel free to contact Dr. Bei Peng (bei.peng@sheffield.ac.uk).
Floor 2, 2025-10-10 14:39:01