The class of multivariate Erlang mixtures with common scale parameter has many desirable properties and has widely been used in insurance loss modeling. The parameters of a multivariate Erlang mixture are normally estimated using an expectation–maximization (EM) algorithm as shown in Lee and Lin (2012) and Verbelen et al. (2016). However, when fitting the mixture to data of high dimension, the fitted density surface is often not smooth (with deep peaks and valleys) and the tail fitting may also be rather unsatisfactory. In this paper, we propose a generalized expectation conditional maximization (GECM) algorithm that maximizes a penalized likelihood with a proposed roughness penalty. The roughness penalty is based on integrated squared second derivative of the density function of aggregate data, which is used in functional data analysis. We illustrate the performance of the proposed method through some numerical experiments and real data applications.