Reinforcement Learning Policy Gradient Methods For Reservoir Operation Management And Control