Semi-decentralized federated learning blends the conventional device-to-server (D2S) interaction structure of federated model training with localized device-to-device (D2D) communications. We study this architecture over practical edge networks with multiple D2D clusters modeled as time-varying and directed communication graphs. Our investigation results in an algorithm that controls the fundamental trade-off between (a) the rate of convergence of the model training process towards the global optimizer, and (b) the number of D2S transmissions required for global aggregation. Specifically, in our semi-decentralized methodology, D2D consensus updates are injected into the federated averaging framework based on column-stochastic weight matrices that encapsulate the connectivity within the clusters. To arrive at our algorithm, we show how the expected optimality gap in the current global model depends on the greatest two singular values of the weighted adjacency matrices (and hence on the densities) of the D2D clusters. We then derive tight bounds on these singular values in terms of the node degrees of the D2D clusters, and we use the resulting expressions to design a threshold on the number of clients required to participate in any given global aggregation round so as to ensure a desired convergence rate. Simulations performed on real-world datasets reveal that our connectivity-aware algorithm reduces the total communication cost required to reach a target accuracy significantly compared with baselines depending on the connectivity structure and the learning task.