Advanced Policy Optimization Algorithms with Flexible Trust Region Constraint