Jae-Kyung Cho: LLM developer, formerly a robotics engineer

Archive

This is a personal space where I review papers and organize related literature.

All posts


24 Mar 2024 » Direct Alignment from Preferences - Part 03. Online DAP

25 Feb 2024 » Direct Alignment from Preferences - Part 02. DAP

22 Feb 2024 » Diary - SSH keygen issue

19 Feb 2024 » Direct Alignment from Preferences - Part 01. RLHF

01 Jan 2024 » Diary - 2023 retrospective

21 Dec 2023 » MLX: Machine Learning framework for Apple silicon - 04. LLM inference example

17 Dec 2023 » MLX: Machine Learning framework for Apple silicon - 03. Multi-Layer Perceptron example

16 Dec 2023 » MLX: Machine Learning framework for Apple silicon - 02. Regression example

15 Dec 2023 » MLX: Machine Learning framework for Apple silicon - 01. Quick-start

03 Oct 2023 » Diary - Using Ubuntu on macOS (Apple silicon M1)

06 May 2023 » Project review - Knowledge distillation from powerful LLM, Alpaca and Koala

26 Apr 2023 » Diary - Ignore value in Cross Entropy function of PyTorch

22 Mar 2023 » Cracking OpenAI - ChatGPT prompt design

21 Feb 2023 » MLOps study - Raviraja Week 6: CI/CD - GitHub Actions

02 Jan 2023 » Diary - 2022 retrospective

29 Nov 2022 » Diary - Netron

22 Nov 2022 » Diary - Gym 0.26.0 update (truncated, no seed)

15 Nov 2022 » Diary - Korean language settings in a Dockerfile

24 Oct 2022 » MLOps study - Raviraja Week 5: Docker

22 Oct 2022 » MLOps study - Raviraja Week 4: ONNX

17 Oct 2022 » MLOps study - Raviraja Week 3: DVC

13 Oct 2022 » MLOps study - Raviraja Week 2: Hydra

13 Oct 2022 » MLOps study - Raviraja Week 1: W&B

11 Oct 2022 » MLOps study - Raviraja Week 0: PyTorch Lightning

16 Aug 2022 » Diary - Squashed Gaussian policy for SAC

17 Jul 2022 » Diary - Network architecture for RL

25 Apr 2022 » Diary - Curious about GPU memory

09 Apr 2022 » Diary - Catastrophic performance drop of off-policy RL methods

19 Mar 2022 » Diary - Common mistake when using torch.distributions.Normal

06 Mar 2022 » Paper review - Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

11 Feb 2022 » Diary - How to solve discrete-SAC loss explosion problem?

30 Jan 2022 » Paper review - Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback

29 Jan 2022 » Diary - How to use proper tense when writing an academic paper?

23 Jan 2022 » Paper review - Interactive teaching strategies for agent training

17 Jan 2022 » Paper review - Learning latent representations to influence multi-agent interaction

10 Jan 2022 » Paper review - Learning trajectory preferences for manipulators via iterative improvement

01 Jan 2022 » Paper review - A survey of preference-based reinforcement learning methods